I'm implementing a bytecode VM and am struggling to reference data stored in a parsed representation of the bytecode. As is the nature of (most) bytecode, it, and thus its parsed representation, remains unmodified once initialized. A separate Vm contains the mutable parts (stack etc.) along with that module. I made an MCVE with additional explanatory comments to illustrate the problem; it's at the bottom and on the playground. The parsed bytecode may look like this:
Module { struct_types: {"Bar": StructType::Named(["a", "b"])} }
The strings "Bar", "a", "b" are references into the bytecode and have lifetime 'b, so we also have lifetimes in the types Module<'b> and StructType<'b>.
After creating this, I will want to create struct instances, think let bar = Bar { a: (), b: () };. At least currently, each struct instance needs to hold a reference to its type, so that type might look like this:
pub struct Struct<'b> {
struct_type: &'b bytecode::StructType<'b>,
fields: Vec<Value<'b>>,
}
The values of a struct's fields may be constants whose value is stored in the bytecode, so the Value enum has a lifetime 'b as well, and that works. The problem is that I have a &'b bytecode::StructType<'b> in the first field: how do I get a reference that lives long enough? I think the reference would actually be valid long enough.
The part of the code that I suspect to be the critical one is here:
pub fn struct_type(&self, _name: &str) -> Option<&'b StructType<'b>> {
// self.struct_types.get(name)
todo!("fix lifetime problems")
}
With the commented-out code, I can't get a 'b reference because the borrow of self.struct_types doesn't live long enough; to fix that I'd need to take &'b self, which would spread virally through the code. Also, most of the time I need to borrow the Vm mutably, which doesn't work if all those exclusive self references have to live that long.
Introducing a separate lifetime 'm so that I could return a &'m StructType<'b> is something I could try as well, but that sounds just as viral and adds another lifetime I need to keep track of; being able to replace 'b with 'm (or at least only having one in each place) would be a bit nicer.
Finally this feels like something that pinning could be helpful with, but I don't understand that topic enough to make an educated guess on how that could be approached.
MCVE
#![allow(dead_code)]
mod bytecode {
use std::collections::BTreeMap;
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StructType<'b> {
/// unit struct type; doesn't have fields
Empty,
/// tuple struct type; fields are positional
Positional(usize),
/// "normal" struct type; fields are named
Named(Vec<&'b str>),
}
impl<'b> StructType<'b> {
pub fn field_count(&self) -> usize {
match self {
Self::Empty => 0,
Self::Positional(field_count) => *field_count,
Self::Named(fields) => fields.len(),
}
}
}
#[derive(Debug, Clone)]
pub struct Module<'b> {
struct_types: BTreeMap<&'b str, StructType<'b>>,
}
impl<'b> Module<'b> {
// here is the problem: I would like to return a reference with lifetime 'b.
// from the point I start executing instructions, I know that I won't modify
// the module (particularly, I won't add entries to the map), so I think that
// lifetime should be possible - pinning? `&'b self` everywhere? idk
pub fn struct_type(&self, _name: &str) -> Option<&'b StructType<'b>> {
// self.struct_types.get(name)
todo!("fix lifetime problems")
}
}
pub fn parse<'b>(bytecode: &'b str) -> Module<'b> {
// this would use nom to parse actual bytecode
assert_eq!(bytecode, "struct Bar { a, b }");
let bar = &bytecode[7..10];
let a = &bytecode[13..14];
let b = &bytecode[16..17];
let fields = vec![a, b];
let bar_struct = StructType::Named(fields);
let struct_types = BTreeMap::from_iter([
(bar, bar_struct),
]);
Module { struct_types }
}
}
mod vm {
use crate::bytecode::{self, StructType};
#[derive(Debug, Clone)]
pub enum Value<'b> {
Unit,
Struct(Struct<'b>),
}
#[derive(Debug, Clone)]
pub struct Struct<'b> {
struct_type: &'b bytecode::StructType<'b>,
fields: Vec<Value<'b>>,
}
impl<'b> Struct<'b> {
pub fn new(struct_type: &'b bytecode::StructType<'b>, fields: Vec<Value<'b>>) -> Self {
Struct { struct_type, fields }
}
}
#[derive(Debug, Clone)]
pub struct Vm<'b> {
module: bytecode::Module<'b>,
}
impl<'b> Vm<'b> {
pub fn new(module: bytecode::Module<'b>) -> Self {
Self { module }
}
pub fn create_struct(&mut self, type_name: &'b str) -> Value<'b> {
let struct_type: &'b StructType<'b> = self.module.struct_type(type_name).unwrap();
// just initialize the fields to something, we don't care
let fields = vec![Value::Unit; struct_type.field_count()];
let value = Value::Struct(Struct::new(struct_type, fields));
value
}
}
}
pub fn main() {
// the bytecode contains all constants needed at runtime;
// we're just interested in how struct types are handled
// obviously the real bytecode is not as human-readable
let bytecode = "struct Bar { a, b }";
// we parse that into a module that, among other things,
// has a map of all struct types
let module = bytecode::parse(bytecode);
println!("{:?}", module);
// we create a Vm that is capable of running commands
// that are stored in the module
let mut vm = vm::Vm::new(module);
// now we try to execute an instruction to create a struct value
// the instruction for this contains a reference to the type name
// stored in the bytecode.
// the struct value contains a reference to its type and holds its field values.
let value = {
let bar = &bytecode[7..10];
vm.create_struct(bar)
};
println!("{:?}", value);
}
&'b bytecode::StructType<'b> is a classic anti-pattern in Rust; it strongly suggests incorrectly annotated lifetimes. It rarely makes sense for an object to depend on some lifetime and for a borrow of that object to have the very same lifetime. That seldom happens on purpose.
So I suspect you need two lifetimes, which I will call 'm and 'b:
'b: the lifetime of the bytecode string, everything that references it will use &'b str.
'm: the lifetime of the Module object. Everything that references it or its contained StructType will use this lifetime.
If split into two lifetimes and adjusted correctly, it simply works:
#![allow(dead_code)]
mod bytecode {
use std::{collections::BTreeMap, iter::FromIterator};
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StructType<'b> {
/// unit struct type; doesn't have fields
Empty,
/// tuple struct type; fields are positional
Positional(usize),
/// "normal" struct type; fields are named
Named(Vec<&'b str>),
}
impl<'b> StructType<'b> {
pub fn field_count(&self) -> usize {
match self {
Self::Empty => 0,
Self::Positional(field_count) => *field_count,
Self::Named(fields) => fields.len(),
}
}
}
#[derive(Debug, Clone)]
pub struct Module<'b> {
struct_types: BTreeMap<&'b str, StructType<'b>>,
}
impl<'b> Module<'b> {
// here is the problem: I would like to return a reference with lifetime 'b.
// from the point I start executing instructions, I know that I won't modify
// the module (particularly, I won't add entries to the map), so I think that
// lifetime should be possible - pinning? `&'b self` everywhere? idk
pub fn struct_type(&self, name: &str) -> Option<&StructType<'b>> {
self.struct_types.get(name)
}
}
pub fn parse<'b>(bytecode: &'b str) -> Module<'b> {
// this would use nom to parse actual bytecode
assert_eq!(bytecode, "struct Bar { a, b }");
let bar = &bytecode[7..10];
let a = &bytecode[13..14];
let b = &bytecode[16..17];
let fields = vec![a, b];
let bar_struct = StructType::Named(fields);
let struct_types = BTreeMap::from_iter([(bar, bar_struct)]);
Module { struct_types }
}
}
mod vm {
use crate::bytecode::{self, StructType};
#[derive(Debug, Clone)]
pub enum Value<'b, 'm> {
Unit,
Struct(Struct<'b, 'm>),
}
#[derive(Debug, Clone)]
pub struct Struct<'b, 'm> {
struct_type: &'m bytecode::StructType<'b>,
fields: Vec<Value<'b, 'm>>,
}
impl<'b, 'm> Struct<'b, 'm> {
pub fn new(struct_type: &'m bytecode::StructType<'b>, fields: Vec<Value<'b, 'm>>) -> Self {
Struct {
struct_type,
fields,
}
}
}
#[derive(Debug, Clone)]
pub struct Vm<'b> {
module: bytecode::Module<'b>,
}
impl<'b> Vm<'b> {
pub fn new(module: bytecode::Module<'b>) -> Self {
Self { module }
}
pub fn create_struct(&mut self, type_name: &str) -> Value<'b, '_> {
let struct_type: &StructType<'b> = self.module.struct_type(type_name).unwrap();
// just initialize the fields to something, we don't care
let fields = vec![Value::Unit; struct_type.field_count()];
let value = Value::Struct(Struct::new(struct_type, fields));
value
}
}
}
pub fn main() {
// the bytecode contains all constants needed at runtime;
// we're just interested in how struct types are handled
// obviously the real bytecode is not as human-readable
let bytecode = "struct Bar { a, b }";
// we parse that into a module that, among other things,
// has a map of all struct types
let module = bytecode::parse(bytecode);
println!("{:?}", module);
// we create a Vm that is capable of running commands
// that are stored in the module
let mut vm = vm::Vm::new(module);
// now we try to execute an instruction to create a struct value
// the instruction for this contains a reference to the type name
// stored in the bytecode.
// the struct value contains a reference to its type and holds its field values.
let value = {
let bar = &bytecode[7..10];
vm.create_struct(bar)
};
println!("{:?}", value);
}
Module { struct_types: {"Bar": Named(["a", "b"])} }
Struct(Struct { struct_type: Named(["a", "b"]), fields: [Unit, Unit] })
It can be simplified further, however: 'm is tied to 'b, because 'b is guaranteed to outlive 'm. Everything that depends on 'm can therefore also access 'b objects, just through the shorter lifetime.
Therefore, let's introduce 'a, which we will now use inside the vm mod to reference anything from the bytecode mod. This also allows lifetime elision at a couple of points, simplifying the code even further:
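To see why a single lifetime is enough, here is a minimal sketch of the underlying rule. The `shorten` helper is hypothetical and exists only for illustration: it shows that a `&'m StructType<'b>` can be used where a `&'m StructType<'m>` is expected, because the longer lifetime 'b may be shortened to 'm.

```rust
#[derive(Debug)]
pub enum StructType<'b> {
    Named(Vec<&'b str>),
}

// Hypothetical helper: collapses the two lifetimes into one.
// This compiles because `&'m T` is covariant in `T` and `StructType<'b>`
// is covariant in `'b`, so the longer `'b` may be shortened to `'m`.
pub fn shorten<'m, 'b: 'm>(ty: &'m StructType<'b>) -> &'m StructType<'m> {
    ty
}

fn main() {
    let bytecode = String::from("a b");
    let ty = StructType::Named(bytecode.split(' ').collect());
    let short = shorten(&ty);
    println!("{:?}", short);
}
```

The reverse direction (lengthening 'm to 'b) is not allowed, which is why the original `Option<&'b StructType<'b>>` signature could not be satisfied from `&self`.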
#![allow(dead_code)]
mod bytecode {
use std::{collections::BTreeMap, iter::FromIterator};
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StructType<'b> {
/// unit struct type; doesn't have fields
Empty,
/// tuple struct type; fields are positional
Positional(usize),
/// "normal" struct type; fields are named
Named(Vec<&'b str>),
}
impl<'b> StructType<'b> {
pub fn field_count(&self) -> usize {
match self {
Self::Empty => 0,
Self::Positional(field_count) => *field_count,
Self::Named(fields) => fields.len(),
}
}
}
#[derive(Debug, Clone)]
pub struct Module<'b> {
struct_types: BTreeMap<&'b str, StructType<'b>>,
}
impl<'b> Module<'b> {
// here is the problem: I would like to return a reference with lifetime 'b.
// from the point I start executing instructions, I know that I won't modify
// the module (particularly, I won't add entries to the map), so I think that
// lifetime should be possible - pinning? `&'b self` everywhere? idk
pub fn struct_type(&self, name: &str) -> Option<&StructType<'b>> {
self.struct_types.get(name)
}
}
pub fn parse<'b>(bytecode: &'b str) -> Module<'b> {
// this would use nom to parse actual bytecode
assert_eq!(bytecode, "struct Bar { a, b }");
let bar = &bytecode[7..10];
let a = &bytecode[13..14];
let b = &bytecode[16..17];
let fields = vec![a, b];
let bar_struct = StructType::Named(fields);
let struct_types = BTreeMap::from_iter([(bar, bar_struct)]);
Module { struct_types }
}
}
mod vm {
use crate::bytecode::{self, StructType};
#[derive(Debug, Clone)]
pub enum Value<'a> {
Unit,
Struct(Struct<'a>),
}
#[derive(Debug, Clone)]
pub struct Struct<'a> {
struct_type: &'a bytecode::StructType<'a>,
fields: Vec<Value<'a>>,
}
impl<'a> Struct<'a> {
pub fn new(struct_type: &'a bytecode::StructType, fields: Vec<Value<'a>>) -> Self {
Struct {
struct_type,
fields,
}
}
}
#[derive(Debug, Clone)]
pub struct Vm<'a> {
module: bytecode::Module<'a>,
}
impl<'a> Vm<'a> {
pub fn new(module: bytecode::Module<'a>) -> Self {
Self { module }
}
pub fn create_struct(&mut self, type_name: &str) -> Value {
let struct_type: &StructType = self.module.struct_type(type_name).unwrap();
// just initialize the fields to something, we don't care
let fields = vec![Value::Unit; struct_type.field_count()];
let value = Value::Struct(Struct::new(struct_type, fields));
value
}
}
}
pub fn main() {
// the bytecode contains all constants needed at runtime;
// we're just interested in how struct types are handled
// obviously the real bytecode is not as human-readable
let bytecode = "struct Bar { a, b }";
// we parse that into a module that, among other things,
// has a map of all struct types
let module = bytecode::parse(bytecode);
println!("{:?}", module);
// we create a Vm that is capable of running commands
// that are stored in the module
let mut vm = vm::Vm::new(module);
// now we try to execute an instruction to create a struct value
// the instruction for this contains a reference to the type name
// stored in the bytecode.
// the struct value contains a reference to its type and holds its field values.
let value = {
let bar = &bytecode[7..10];
vm.create_struct(bar)
};
println!("{:?}", value);
}
Module { struct_types: {"Bar": Named(["a", "b"])} }
Struct(Struct { struct_type: Named(["a", "b"]), fields: [Unit, Unit] })
Fun fact: This is now one of the rare cases where we legitimately have to use &'a bytecode::StructType<'a>, so take my opening statement with a grain of salt, and you were kind of right all along :)
The crazy thing is if we then rename 'a to 'b to be consistent with your original code, we get almost your code with only some minor differences:
#![allow(dead_code)]
mod bytecode {
use std::{collections::BTreeMap, iter::FromIterator};
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StructType<'b> {
/// unit struct type; doesn't have fields
Empty,
/// tuple struct type; fields are positional
Positional(usize),
/// "normal" struct type; fields are named
Named(Vec<&'b str>),
}
impl<'b> StructType<'b> {
pub fn field_count(&self) -> usize {
match self {
Self::Empty => 0,
Self::Positional(field_count) => *field_count,
Self::Named(fields) => fields.len(),
}
}
}
#[derive(Debug, Clone)]
pub struct Module<'b> {
struct_types: BTreeMap<&'b str, StructType<'b>>,
}
impl<'b> Module<'b> {
// here is the problem: I would like to return a reference with lifetime 'b.
// from the point I start executing instructions, I know that I won't modify
// the module (particularly, I won't add entries to the map), so I think that
// lifetime should be possible - pinning? `&'b self` everywhere? idk
pub fn struct_type(&self, name: &str) -> Option<&StructType<'b>> {
self.struct_types.get(name)
}
}
pub fn parse<'b>(bytecode: &'b str) -> Module<'b> {
// this would use nom to parse actual bytecode
assert_eq!(bytecode, "struct Bar { a, b }");
let bar = &bytecode[7..10];
let a = &bytecode[13..14];
let b = &bytecode[16..17];
let fields = vec![a, b];
let bar_struct = StructType::Named(fields);
let struct_types = BTreeMap::from_iter([(bar, bar_struct)]);
Module { struct_types }
}
}
mod vm {
use crate::bytecode::{self, StructType};
#[derive(Debug, Clone)]
pub enum Value<'b> {
Unit,
Struct(Struct<'b>),
}
#[derive(Debug, Clone)]
pub struct Struct<'b> {
struct_type: &'b bytecode::StructType<'b>,
fields: Vec<Value<'b>>,
}
impl<'b> Struct<'b> {
pub fn new(struct_type: &'b bytecode::StructType, fields: Vec<Value<'b>>) -> Self {
Struct {
struct_type,
fields,
}
}
}
#[derive(Debug, Clone)]
pub struct Vm<'b> {
module: bytecode::Module<'b>,
}
impl<'b> Vm<'b> {
pub fn new(module: bytecode::Module<'b>) -> Self {
Self { module }
}
pub fn create_struct(&mut self, type_name: &str) -> Value {
let struct_type: &StructType = self.module.struct_type(type_name).unwrap();
// just initialize the fields to something, we don't care
let fields = vec![Value::Unit; struct_type.field_count()];
let value = Value::Struct(Struct::new(struct_type, fields));
value
}
}
}
pub fn main() {
// the bytecode contains all constants needed at runtime;
// we're just interested in how struct types are handled
// obviously the real bytecode is not as human-readable
let bytecode = "struct Bar { a, b }";
// we parse that into a module that, among other things,
// has a map of all struct types
let module = bytecode::parse(bytecode);
println!("{:?}", module);
// we create a Vm that is capable of running commands
// that are stored in the module
let mut vm = vm::Vm::new(module);
// now we try to execute an instruction to create a struct value
// the instruction for this contains a reference to the type name
// stored in the bytecode.
// the struct value contains a reference to its type and holds its field values.
let value = {
let bar = &bytecode[7..10];
vm.create_struct(bar)
};
println!("{:?}", value);
}
Module { struct_types: {"Bar": Named(["a", "b"])} }
Struct(Struct { struct_type: Named(["a", "b"]), fields: [Unit, Unit] })
So the actual fix for your original code is as follows:
4c4
< use std::collections::BTreeMap;
---
> use std::{collections::BTreeMap, iter::FromIterator};
36,38c36,37
< pub fn struct_type(&self, _name: &str) -> Option<&'b StructType<'b>> {
< // self.struct_types.get(name)
< todo!("fix lifetime problems")
---
> pub fn struct_type(&self, name: &str) -> Option<&StructType<'b>> {
> self.struct_types.get(name)
73c72
< pub fn new(struct_type: &'b bytecode::StructType<'b>, fields: Vec<Value<'b>>) -> Self {
---
> pub fn new(struct_type: &'b bytecode::StructType, fields: Vec<Value<'b>>) -> Self {
91,92c90,91
< pub fn create_struct(&mut self, type_name: &'b str) -> Value<'b> {
< let struct_type: &'b StructType<'b> = self.module.struct_type(type_name).unwrap();
---
> pub fn create_struct(&mut self, type_name: &str) -> Value {
> let struct_type: &StructType = self.module.struct_type(type_name).unwrap();
I hope deriving them step by step made it somewhat clear why those changes are necessary.
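One thing to keep in mind with the final signature: the elided lifetime in `create_struct(&mut self, ...) -> Value` is tied to the `&mut self` borrow, so the Vm cannot be borrowed again while a returned value is still in use. The following is only a sketch of one possible variation, not part of the fix above: the Vm borrows the module instead of owning it, so values borrow from the module. The `created` counter, `sample_module`, and the simplified `StructType` and `Struct` are stand-ins I made up for illustration.

```rust
use std::collections::BTreeMap;

#[derive(Debug)]
pub struct StructType<'b> {
    pub fields: Vec<&'b str>,
}

#[derive(Debug)]
pub struct Module<'b> {
    pub struct_types: BTreeMap<&'b str, StructType<'b>>,
}

impl<'b> Module<'b> {
    pub fn struct_type(&self, name: &str) -> Option<&StructType<'b>> {
        self.struct_types.get(name)
    }
}

#[derive(Debug)]
pub struct Struct<'m> {
    pub struct_type: &'m StructType<'m>,
    pub field_count: usize,
}

// Hypothetical variation: the Vm borrows the module instead of owning it.
pub struct Vm<'m> {
    module: &'m Module<'m>,
    created: usize, // stand-in for the mutable VM state (stack etc.)
}

impl<'m> Vm<'m> {
    pub fn new(module: &'m Module<'m>) -> Self {
        Vm { module, created: 0 }
    }

    // The result carries 'm (the module borrow), not the `&mut self` borrow.
    pub fn create_struct(&mut self, name: &str) -> Option<Struct<'m>> {
        let module: &'m Module<'m> = self.module; // copy the shared reference out
        let struct_type = module.struct_type(name)?;
        self.created += 1;
        Some(Struct { struct_type, field_count: struct_type.fields.len() })
    }

    pub fn created(&self) -> usize {
        self.created
    }
}

// Simplified stand-in for the parser in the question.
pub fn sample_module(bytecode: &str) -> Module<'_> {
    let mut struct_types = BTreeMap::new();
    let fields = vec![&bytecode[13..14], &bytecode[16..17]];
    struct_types.insert(&bytecode[7..10], StructType { fields });
    Module { struct_types }
}

fn main() {
    let bytecode = "struct Bar { a, b }";
    let module = sample_module(bytecode);
    let mut vm = Vm::new(&module);
    let first = vm.create_struct("Bar").unwrap();
    let second = vm.create_struct("Bar").unwrap(); // `first` is still usable
    println!("{:?} {:?} created={}", first, second, vm.created());
}
```

The trade-off is that the module has to be kept alive outside the Vm for as long as the Vm and its values exist.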
A typical AST implementation in C might look like:
typedef enum {
AST_IDENT,
AST_STRING,
} Ast_Tag;
typedef struct { Ast_Tag tag; u32 flags; } Ast;
typedef struct { Ast base; Intern *name; } Ast_Ident;
typedef struct { Ast base; Intern *str; } Ast_String;
// A function that downcasts to the derived type:
void foo (Ast *node) {
switch (node->tag) {
case AST_IDENT: {
Ast_Ident *ident = (Ast_Ident*)node;
do_something_with_name(ident->name);
} break;
case AST_STRING: {
Ast_String *str = (Ast_String*)node;
do_something_with_name(str->str);
} break;
}
}
// A function that upcasts to the base type:
void bar (Ast_Ident *ident) {
foo((Ast*)ident);
}
Is there a way to do this in Rust? I'd imagine that downcasting would be particularly problematic.
NOTE: I'm not asking how to implement an AST, but how to replicate the struct inheritance as demonstrated above.
EDIT: Sorry for the confusion. The point here is that struct inheritance is used in order to avoid making every node be the same size, so enums are not an option.
You are basically reimplementing Rust's enum variants:
struct Ast {
flags: u32,
kind: AstKind,
}
enum AstKind {
Ident(String),
String(String),
}
fn foo(ast: Ast) {
match ast.kind {
AstKind::Ident(ident) => println!("flags: {:?}, ident: {:?}", ast.flags, ident),
AstKind::String(s) => println!("flags: {:?}, string: {:?}", ast.flags, s),
}
}
fn bar(flags: u32, ident: String) {
foo(Ast {
flags,
kind: AstKind::Ident(ident),
})
}
fn main() {
bar(42, "MisterMV".into())
}
You seem to be concerned with the size:
The point here is that struct inheritance is used in order to avoid making every node be the same size, so enums are not an option.
That is generally not a concern. Clippy has a lint called large_enum_variant, with a reasonable default threshold that you can change. Clippy also proposes a reasonable solution:
enum AstKind {
Ident(Box<Big>),
String(Small),
}
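To make the size effect concrete, here is a small sketch that compares the two layouts with `std::mem::size_of`. The `Big` and `Small` definitions are hypothetical placeholders (the snippet above does not define them); only the relative sizes matter.

```rust
use std::mem::size_of;

// Hypothetical payloads standing in for `Big` and `Small`.
pub struct Big {
    pub data: [u8; 1024],
}
pub struct Small(pub u8);

// Every value of this enum is as large as its largest variant.
pub enum Unboxed {
    Ident(Big),
    String(Small),
}

// Boxing the large payload keeps the enum itself pointer-sized (plus tag).
pub enum Boxed {
    Ident(Box<Big>),
    String(Small),
}

fn main() {
    println!("unboxed: {} bytes", size_of::<Unboxed>());
    println!("boxed:   {} bytes", size_of::<Boxed>());
}
```

The cost of the boxed version is one heap allocation and one pointer indirection for the large variant.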
There are other alternatives that the other answers already cover.
You can reference multiple types through a trait object, but you can't cast the trait object back to the exact type. To the rescue comes the Any trait, which has is and downcast_ref methods; Box<dyn Any> additionally has a downcast method. You can also get type IDs.
But there is no inheritance like you see in many OOP languages. That goal is accomplished by implementing traits.
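As a rough sketch of the Any approach, the following uses `downcast_ref` to recover the concrete node type. The `AstIdent` and `AstString` structs and the `describe` function are hypothetical names chosen to mirror the question.

```rust
use std::any::Any;

pub struct AstIdent {
    pub name: String,
}

pub struct AstString {
    pub string: String,
}

// Tries each known concrete type in turn; `downcast_ref` returns `None`
// when the value behind the trait object is of a different type.
pub fn describe(node: &dyn Any) -> String {
    if let Some(ident) = node.downcast_ref::<AstIdent>() {
        format!("ident: {}", ident.name)
    } else if let Some(s) = node.downcast_ref::<AstString>() {
        format!("string: {}", s.string)
    } else {
        String::from("unknown node")
    }
}

fn main() {
    let nodes: Vec<Box<dyn Any>> = vec![
        Box::new(AstIdent { name: "x".to_string() }),
        Box::new(AstString { string: "hello".to_string() }),
        Box::new(42u32),
    ];
    for node in nodes.iter() {
        // `&**node` reborrows the boxed value itself, not the `Box`.
        println!("{}", describe(&**node));
    }
}
```

Unlike the tag-based C version, nothing forces the chain of `downcast_ref` calls to be exhaustive, so an enum with `match` is usually the safer choice when the set of node types is closed.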
That's impossible in safe Rust. You can however do it with transmute, provided that your structs are all repr(C):
pub enum AstTag {
AstIdent,
AstString,
}
#[repr(C)]
pub struct Ast { tag: AstTag, flags: u32, }
#[repr(C)]
pub struct AstIdent { base: Ast, name: String, }
#[repr(C)]
pub struct AstString { base: Ast, string: String, }
// A function that downcasts to the derived type:
pub fn foo (node: &Ast) {
use std::mem::transmute;
match node.tag {
AstTag::AstIdent => {
let ident: &AstIdent = unsafe { transmute (node) };
println!("{}", ident.name);
},
AstTag::AstString => {
let ident: &AstString = unsafe { transmute (node) };
println!("{}", ident.string);
}
}
}
// A function that upcasts to the base type:
pub fn bar (ident: &AstIdent) {
foo (&ident.base);
}
However this is very unidiomatic. A much more idiomatic way of doing things would be to put foo in a trait and implement that trait for each struct:
pub struct AstIdent { flags: u32, name: String, }
pub struct AstString { flags: u32, string: String, }
pub trait Foo {
fn foo (&self);
}
impl Foo for AstIdent {
fn foo (&self) {
println!("{}", self.name);
}
}
impl Foo for AstString {
fn foo (&self) {
println!("{}", self.string);
}
}
pub fn bar (ident: &AstIdent) {
ident.foo();
}
For a Gameboy emulator, you have two u8 fields for registers A and F, but they can sometimes be accessed as AF, a combined u16 register.
In C, it looks like you can do something like this:
struct {
union {
struct {
unsigned char f;
unsigned char a;
};
unsigned short af;
};
};
(Taken from here)
Is there a way in Rust, ideally without unsafe, of being able to access two u8s as registers.a/registers.f, but also be able to use them as the u16 registers.af?
I can give you a couple of ways to do it. The first is a straightforward unsafe analogue without boilerplate; the second is safe but explicit.
Unions in Rust are very similar, so you can translate it to this:
#[repr(C)]
struct Inner {
f: u8,
a: u8,
}
#[repr(C)]
union S {
inner: Inner,
af: u16,
}
// Usage:
// Putting data is safe:
let s = S { af: 12345 };
// but retrieving is not:
let a = unsafe { s.inner.a };
Or as an alternative you may manually do all of the explicit casts wrapped in a structure:
#[repr(transparent)]
// The derive below is optional, but it allows chaining;
// you may remove these derives and change method
// signatures to `&self` and `&mut self`.
#[derive(Clone, Copy)]
struct T(u16);
impl T {
pub fn from_af(af: u16) -> Self {
Self(af)
}
pub fn from_a_f(a: u8, f: u8) -> Self {
// `f` is the low byte and `a` the high byte, matching `f()` and `a()` below.
Self::from_af(u16::from_le_bytes([f, a]))
}
pub fn af(self) -> u16 {
self.0
}
pub fn f(self) -> u8 {
self.0.to_le_bytes()[0]
}
pub fn set_f(self, f: u8) -> Self {
Self::from_a_f(self.a(), f)
}
pub fn a(self) -> u8 {
self.0.to_le_bytes()[1]
}
pub fn set_a(self, a: u8) -> Self {
Self::from_a_f(a, self.f())
}
}
// Usage:
let t = T::from_af(12345);
let a = t.a();
let new_af = t.set_a(12).set_f(t.f() + 1).af();
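If plain field access (`registers.a`, `registers.f`) is the priority, a third option is to store the two bytes as fields and compute the combined value on demand. This is only a sketch under the assumption that A is the high byte and F the low byte of AF, as in the C union on a little-endian host; the `Registers` type and its method names are made up for illustration.

```rust
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Registers {
    pub a: u8,
    pub f: u8,
}

impl Registers {
    // Combines the two 8-bit registers; `a` is the high byte, `f` the low byte.
    pub fn af(&self) -> u16 {
        u16::from_be_bytes([self.a, self.f])
    }

    // Splits a 16-bit value back into the two 8-bit registers.
    pub fn set_af(&mut self, af: u16) {
        let [a, f] = af.to_be_bytes();
        self.a = a;
        self.f = f;
    }
}

fn main() {
    let mut regs = Registers::default();
    regs.set_af(0x12B0);
    regs.a = regs.a.wrapping_add(1);
    println!("a={:#04x} f={:#04x} af={:#06x}", regs.a, regs.f, regs.af());
}
```

This needs no unsafe and does not depend on the host's endianness, at the price of `af` being a method call rather than a field.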
I have a builder pattern implemented for my struct:
pub struct Struct {
pub grand_finals_modifier: bool,
}
impl Struct {
pub fn new() -> Struct {
Struct {
grand_finals_modifier: false,
}
}
pub fn grand_finals_modifier<'a>(&'a mut self, grand_finals_modifier: bool) -> &'a mut Struct {
self.grand_finals_modifier = grand_finals_modifier;
self
}
}
Is it possible in Rust to write a macro for methods like this, to generalize them and avoid a lot of duplicated code? Something that we could use as follows:
impl Struct {
builder_field!(hello, bool);
}
After reading the documentation, I've come up with this code:
macro_rules! builder_field {
($field:ident, $field_type:ty) => {
pub fn $field<'a>(&'a mut self,
$field: $field_type) -> &'a mut Self {
self.$field = $field;
self
}
};
}
struct Struct {
pub hello: bool,
}
impl Struct {
builder_field!(hello, bool);
}
fn main() {
let mut s = Struct {
hello: false,
};
s.hello(true);
println!("Struct hello is: {}", s.hello);
}
It does exactly what I need: creates a public builder method with specified name, specified member and type.
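If a struct has many fields, the same idea can be extended with a repetition so that one invocation generates all setters. This is a sketch of one possible variation; the `builder_fields!` name, the `name: type` syntax, and the `Settings` struct are my own choices, not something from the documentation.

```rust
// Generates one chaining setter per `name: type` pair.
macro_rules! builder_fields {
    ($($field:ident : $field_type:ty),* $(,)?) => {
        $(
            pub fn $field(&mut self, $field: $field_type) -> &mut Self {
                self.$field = $field;
                self
            }
        )*
    };
}

#[derive(Debug, Default)]
struct Settings {
    hello: bool,
    retries: u32,
}

impl Settings {
    builder_fields!(hello: bool, retries: u32);
}

fn main() {
    let mut s = Settings::default();
    // Each setter returns `&mut Self`, so calls can be chained.
    s.hello(true).retries(3);
    println!("{:?}", s);
}
```

The explicit `'a` lifetime from the single-field version is not needed here, since lifetime elision gives `&mut self -> &mut Self` the same meaning.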
To complement the already accepted answer, since it is 4 years old by now, you should check out the crate rust-derive-builder. It uses procedural macros to automatically implement the builder pattern for any struct.
My FFI binding returns a struct with fixed-size c_char arrays, and I would like to turn those into std::ffi::CString or std::string::String.
It looks like the CString::new function coerces the pointer to a vector.
use std::ffi::CString;
use std::os::raw::c_char;
#[repr(C)]
pub struct FFIStruct {
pub Id: [::std::os::raw::c_char; 256usize],
pub Description: [::std::os::raw::c_char; 256usize],
}
fn get_struct() -> Option<FFIStruct> {
println!("cheating");
None
}
pub fn main() {
match get_struct() {
Some(thing) =>
println!("Got id:{}",CString::new(thing.Id.as_ptr())),
None => (),
}
}
Here is the Rust Playground link.
C strings that you don't own should be translated using CStr, not CString. You can then convert it into an owned representation (CString) or convert it into a String:
extern crate libc;
use libc::c_char;
use std::ffi::CStr;
pub fn main() {
let id = [0 as c_char; 256];
let rust_id = unsafe { CStr::from_ptr(id.as_ptr()) };
let rust_id = rust_id.to_owned();
println!("{:?}", rust_id);
}
You should also use the libc crate for types like c_char.
There is also this kind of solution:
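use std::iter::FromIterator;
use std::os::raw::c_char;
// Reads up to the first NUL and maps each byte to the char with the same code point.
fn zascii(slice: &[c_char]) -> String {
String::from_iter(slice.iter().take_while(|c| **c != 0).map(|c| *c as u8 as char))
}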
You can create a std::ffi::CStr from a pointer, but you have to use the unsafe keyword, like this:
use std::ffi::CStr;
//use std::os::raw::c_char;
#[repr(C)]
pub struct FFIStruct {
pub id: [::std::os::raw::c_char; 256usize],
pub description: [::std::os::raw::c_char; 256usize],
}
fn get_struct() -> Option<FFIStruct> {
println!("cheating");
None
}
pub fn main() {
match get_struct() {
Some(thing) =>
println!("Got id:{:?}",unsafe{CStr::from_ptr(thing.id.as_ptr())}),
None => (),
}
}
You can also convert the CStr into a string with to_string_lossy, which returns a Cow<str>; call into_owned() on it if you need a String:
unsafe { CStr::from_ptr(thing.id.as_ptr()) }.to_string_lossy()
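Since the arrays have a known fixed size, the conversion can also be done without unsafe by scanning the slice for the first NUL. The sketch below is one way to do that; `fixed_to_string` is a hypothetical helper name, and it assumes the bytes should be decoded as UTF-8 with invalid sequences replaced.

```rust
use std::os::raw::c_char;

// Converts a fixed-size C character buffer into an owned String.
// Stops at the first NUL, or at the end of the slice if there is none,
// so it never reads past the buffer.
pub fn fixed_to_string(buf: &[c_char]) -> String {
    let bytes: Vec<u8> = buf
        .iter()
        .take_while(|&&c| c != 0)
        .map(|&c| c as u8)
        .collect();
    String::from_utf8_lossy(&bytes).into_owned()
}

fn main() {
    // Simulates a buffer filled in by C code.
    let mut id = [0 as c_char; 256];
    for (dst, src) in id.iter_mut().zip(b"sensor-1") {
        *dst = *src as c_char;
    }
    println!("Got id: {}", fixed_to_string(&id));
}
```

In contrast to `CStr::from_ptr`, which keeps reading until it finds a NUL, this stays within the array even if the C side filled all 256 bytes without a terminator.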