How can I serialize a `struct` with Strings using Borsh? - rust

I have this struct:
#[derive(BorshSerialize, BorshDeserialize, Debug)]
pub struct User {
pub name: String,
}
And along with it a program to simply associate it with a Solana account and set a name:
pub fn process_instruction(
_program_id: &Pubkey,
accounts: &[AccountInfo],
_instruction_data: &[u8],
) -> ProgramResult {
let accounts_iter = &mut accounts.iter();
let account = next_account_info(accounts_iter)?;
let mut user_account = User::try_from_slice(&account.data.borrow())?;
user_account.name = String::from("John Doe");
msg!("{:?}", user_account);
user_account.serialize(&mut &mut account.data.borrow_mut()[..])?;
Ok(())
}
I am running this contract on a localhost server. Most of this code is taken from the Solana Labs "Hello, World!" example (with the main changes being mentioned above).
The error comes at the second-to-last line, when trying to serialize user_account:
Transaction simulation failed: Error processing Instruction 0: Failed to serialize or deserialize account data: Unknown
Program 6rJVDwbQZUvpK5WKYNtYWgrSXym58DC7V7EV2YnpdEAk invoke [1]
Program log: User { name: "John Doe" }
Program 6rJVDwbQZUvpK5WKYNtYWgrSXym58DC7V7EV2YnpdEAk consumed 4839 of 1400000 compute units
Program 6rJVDwbQZUvpK5WKYNtYWgrSXym58DC7V7EV2YnpdEAk failed: Failed to serialize or deserialize account data: Unknown
My goal with this program is to be able to have a String field in a struct, and have it be serialized by Borsh.
Interestingly enough, replacing String::from("John Doe") with String::new() or String::from("") does not yield the error, but the program succeeds in this case with the name being set to an empty string.

If you look at the formal specification for Borsh, it says that strings are encoded as:
String, string_type: "String", encoded = utf8_encoding(x) as Vec<u8>, repr(encoded.len() as u32) repr(encoded as Vec<u8>)
Which means that it encodes the length of the string as a u32, followed by the utf-8 encoded string bytes. For the string "John Doe", you need to have 12 bytes, 4 for the length, and 8 for the contents.
If you want to do this programmatically, you either need to do system_program::create_account with the calculated bytes beforehand, or you can do create_account on-chain using a program-derived address.
More information about the Borsh specification at: https://github.com/near/borsh#specification

Related

How can I simultaneously access data from and call a method from a struct?

For my first project I wanted to create a terminal implementation of Monopoly. I have created a Card, Player, and App structs. Originally my plan was to create an array inside the App struct holding a list of cards which I could then randomly select and run an execute() method on, which would push a log to the logs field of the App. What I thought would be best was this:
pub struct Card {
pub group: u8,
pub name: String,
pub id: u8,
}
impl Card {
pub fn new(group: u8, name: String, id: u8) -> Card {
Card { group, name, id }
}
pub fn execute(self, app: &mut App) {
match self.id {
1 => {
app.players[app.current_player].index = 24;
app.push_log("this works".to_string(), "LOG: ".to_string());
}
_ => {}
}
}
}
pub struct Player<'a> {
pub name: &'a str,
pub money: u64,
pub index: u8,
pub piece: char,
pub properties: Vec<u8>,
pub goojf: u8, // get out of jail freeh
}
impl<'a> Player<'a> {
pub fn new(name: &'a str, piece: char) -> Player<'a> {
Player {
name,
money: 1500,
index: 0,
piece,
properties: Vec::new(),
goojf: 0,
}
}
}
pub struct App<'a> {
pub cards: [Card; 1],
pub logs: Vec<(String, String)>,
pub players: Vec<Player<'a>>,
pub current_player: usize,
}
impl<'a> App<'a> {
pub fn new() -> App<'a> {
App {
cards: [Card::new(
11,
"Take a ride on the Penn. Railroad!".to_string(),
0,
)],
logs: vec![(
String::from("You've begun a game!"),
String::from("BEGIN!:"),
)],
players: vec![Player::new("Joe", '#')],
current_player: 0,
}
}
pub fn draw_card(&mut self) {
if self.players[self.current_player].index == 2
|| self.players[self.current_player].index == 17
|| self.players[self.current_player].index == 33
{
self.cards[0].execute(self);
}
}
pub fn push_log(&mut self, message: String, status: String) {
self.logs.push((message, status));
}
}
fn main() {}
However this code throws the following error:
error[E0507]: cannot move out of `self.cards[_]` which is behind a mutable reference
--> src/main.rs:76:13
|
76 | self.cards[0].execute(self);
| ^^^^^^^^^^^^^ move occurs because `self.cards[_]` has type `Card`, which does not implement the `Copy` trait
I managed to fix this error by simply declaring the array of cards in the method itself, however, this seem to be pretty brute and not at all efficient, especially since other methods in my program depend on cards. How could I just refer to a single array of cards for all of my methods implemented in App or elsewhere?
As others have pointed out, Card::execute() needs to take &self instead of self, but then you run into the borrow checker issue, which I'll spend the rest of this answer discussing.
It may seem odd, but Rust is actually protecting you here. The borrow checker does not look into functions to see what they do, so it has no idea that Card::execute() won't do something to invalidate the referenced passed as the first argument. For example, if App::cards was a vector instead of an array, it could clear the vector.
Something that could actually practically happen here to cause undefined behavior would be if Card::execute() took a string slice from self.name and then cleared the card's name attribute through the mutable reference to app. None of these actions would be prohibited, and you'd be left with an invalid reference to a string slice. This is why the borrow checker isn't letting you make this method call, and this is exactly the kind of accident that Rust is designed to prevent.
There's a few ways around this. One option is to only pass the pieces of the App value needed to complete the task. You can borrow different parts of the same value. The problem here is that the reborrow of self overlaps with the borrow self.cards[0]. Passing each field separately isn't very ergonomic though, as in this case you'll wind up having to pass a reference to pretty much everything else on App.
It looks like the Card values don't actually contain any game state, and are used as data for the game engine. If this is the case, then the cards can live outside of App, like so:
pub struct App<'a> {
// Changed to an unowned slice.
pub cards: &'a [Card],
pub logs: Vec<(String, String)>,
pub players: Vec<Player<'a>>,
pub current_player: usize,
}
impl<'a> App<'a> {
pub fn new(cards: &'a [Card]) -> App<'a> {
App {
cards,
// ...
Then in your main() you can initialize the data and borrow it:
fn main() {
let cards: [Card; 1] = [Card::new(
11,
"Take a ride on the Penn. Railroad!".to_string(),
0,
)];
let app = App::new(&cards);
}
This solves the compilation problem. I'd suggest making other changes, as well:
App should probably be renamed GameState or something to emphasize that this struct should contain only mutable game state, and no immutable reference data.
Player's name field should probably be an owned String instead of an unowned &str, otherwise some other entity in the program will need to own a string slice for the duration of the program so that Player can borrow it.
Does a card need to be able to mutate the cards in self? If you know that execute will not need access to cards, the easiest solution is to split App into further structs so you can better limit the access of the function.
Unless there is another layer to your program that interacts with App as a singular object, requiring everything be in a single struct to perform operations will likely only constrain your code. If possible it is better to split it into its core components so you can be more selective when sharing references and avoid needing to make so many fields pub.
Here is a rough example:
pub struct GameState<'a> {
pub players: Vec<Player<'a>>,
pub current_player: usize,
}
/// You may want to look into using https://crates.io/crates/log instead to make logging easier.
pub struct Logs {
entries: Vec<(String, String)>,
}
impl Logs {
/// Use ToOwned so you can use both both &str and String
pub fn push<S: ToOwned<String>>(&mut self, message: S, status: S) {
self.entries.push((message.to_owned(), status.to_owned()));
}
}
pub struct Card {
group: u8,
name: String,
id: u8,
}
impl Card {
pub fn new(group: u8, name: String, id: u8) -> Self {
Card { group, name, id }
}
/// Use &self because there is no reason we need are required to consume the card
pub fn execute(&self, state: &mut GameState, logs: &mut Logs) {
if self.id == 1{
state.players[state.current_player].index = 24;
logs.push("this works", "LOG");
}
}
}
Also as a side note, execute consumes a card when called since it does not take a reference. Since Card does not implement Copy, that would require it be removed from cards so it can be moved.
Misc Tips and Code Review
Using IDs
It looks like you frequently use IDs to distinguish between items. However, I think your code will look cleaner and be easier to write if you used more human readable types. For example, do cards need an ID? Usually it is preferable to define your struct based on how the data is used. Last time I played monopoly, I don't remember picking up a card and referring to the card ID to determine what to do. I would instead recommend defining a Card by how it is used in the came. If I am remembering correctly, each card contains a short message telling the player what to do. Technically a card could consist of just the message, but you can instead make your code a bit cleaner by separating out the action to an enum so actions are not hard coded to the text on the card.
pub struct Card {
text: String,
action: CardAction,
}
// Note: enums which can also hold data, are more commonly referred to as
// "tagged unions" in computer science. It can be a pain to search for them if
// you don't know what they are called.
pub enum CardAction {
ProceedToGo,
GoToJail,
GainOrLoseMoney(i64),
// etc.
}
On a similar note, it looks like you are trying to be memory conscious by using the smallest type required for a given value. I would recommend against this thinking. If a value of 253u8 is equally as invalid as 324234i32, then the smaller type is not doing anything to help you. You might as well use i32/u32 or i64/u64 since most systems will have an easier time operating on these types. The same thing goes for indices and using other integer types instead of usize since choosing to use another type will only give you more work converting it to and from a usize.
Sharing Owned References
Depending on your design philosophy you might want to store a reference to a struct in multiple places. This can be done using a reference counter Rc<T>. Here are some quick examples. Note that these can not be shared between threads.
let property: Rc<Property> = Rc::new(Property::new(/* etc */));
// Holds an owned reference to the same property as property that can be accessed immutably
let ref_to_property: Rc<Property> = property.clone();
// Or if you want interior mutability you can use Rc<RefCell<T>> instead.
let mutable_property = Rc::new(RefCell::new(Property::new(/* etc */)));

How to write a Vec of structs to a file? [duplicate]

I want to send my struct via a TcpStream. I could send String or u8, but I can not send an arbitrary struct. For example:
struct MyStruct {
id: u8,
data: [u8; 1024],
}
let my_struct = MyStruct { id: 0, data: [1; 1024] };
let bytes: &[u8] = convert_struct(my_struct); // how??
tcp_stream.write(bytes);
After receiving the data, I want to convert &[u8] back to MyStruct. How can I convert between these two representations?
I know Rust has a JSON module for serializing data, but I don't want to use JSON because I want to send data as fast and small as possible, so I want to no or very small overhead.
A correctly sized struct as zero-copied bytes can be done using stdlib and a generic function.
In the example below there there is a reusable function called any_as_u8_slice instead of convert_struct, since this is a utility to wrap cast and slice creation.
Note that the question asks about converting, this example creates a read-only slice, so has the advantage of not needing to copy the memory.
Heres a working example based on the question:
unsafe fn any_as_u8_slice<T: Sized>(p: &T) -> &[u8] {
::core::slice::from_raw_parts(
(p as *const T) as *const u8,
::core::mem::size_of::<T>(),
)
}
fn main() {
struct MyStruct {
id: u8,
data: [u8; 1024],
}
let my_struct = MyStruct { id: 0, data: [1; 1024] };
let bytes: &[u8] = unsafe { any_as_u8_slice(&my_struct) };
// tcp_stream.write(bytes);
println!("{:?}", bytes);
}
Note 1) even though 3rd party crates might be better in some cases, this is such a primitive operation that its useful to know how to do in Rust.
Note 2) at time of writing (Rust 1.15), there is no support for const functions. Once there is, it will be possible to cast into a fixed sized array instead of a slice.
Note 3) the any_as_u8_slice function is marked unsafe because any padding bytes in the struct may be uninitialized memory (giving undefined behavior).
If there were a way to ensure input arguments used only structs which were #[repr(packed)], then it could be safe.
Otherwise the function is fairly safe since it prevents buffer over-run since the output is read-only, fixed number of bytes, and its lifetime is bound to the input.If you wanted a version that returned a &mut [u8], that would be quite dangerous since modifying could easily create inconsistent/corrupt data.
(Shamelessly stolen and adapted from Renato Zannon's comment on a similar question)
Perhaps a solution like bincode would suit your case? Here's a working excerpt:
Cargo.toml
[package]
name = "foo"
version = "0.1.0"
authors = ["An Devloper <an.devloper#example.com>"]
edition = "2018"
[dependencies]
bincode = "1.0"
serde = { version = "1.0", features = ["derive"] }
main.rs
use serde::{Deserialize, Serialize};
use std::fs::File;
#[derive(Serialize, Deserialize)]
struct A {
id: i8,
key: i16,
name: String,
values: Vec<String>,
}
fn main() {
let a = A {
id: 42,
key: 1337,
name: "Hello world".to_string(),
values: vec!["alpha".to_string(), "beta".to_string()],
};
// Encode to something implementing `Write`
let mut f = File::create("/tmp/output.bin").unwrap();
bincode::serialize_into(&mut f, &a).unwrap();
// Or just to a buffer
let bytes = bincode::serialize(&a).unwrap();
println!("{:?}", bytes);
}
You would then be able to send the bytes wherever you want. I assume you are already aware of the issues with naively sending bytes around (like potential endianness issues or versioning), but I'll mention them just in case ^_^.

Map C-like packed data structure to Rust struct

I'm fairly new to Rust and have spent most of my time writing code in C/C++. I have a flask webserver that returns back a packed data structure in the form of length + null-terminated string:
test_data = "Hello there bob!" + "\x00"
test_data = test_data.encode("utf-8")
data = struct.pack("<I", len(test_data ))
data += test_data
return data
In my rust code, I'm using the easy_http_request crate and can successfully get the response back by calling get_from_url_str. What I'm trying to do is map the returned response back to the Test data structure (if possible). I've attempted to use align_to to unsuccessfully get the string data mapped to the structure.
extern crate easy_http_request;
extern crate libc;
use easy_http_request::DefaultHttpRequest;
use libc::c_char;
#[repr(C, packed)]
#[derive(Debug, Clone, Copy)]
struct Test {
a: u32,
b: *const c_char // TODO: What do I put here???
}
fn main() {
let response = DefaultHttpRequest::get_from_url_str("http://localhost:5000/").unwrap().send().unwrap();
let (head, body, _tail) = unsafe { response.body.align_to::<Test>() };
let my_test: Test = body[0];
println!("{}", my_test.a); // Correctly prints '17'
println!("{:?}", my_test.b); // Fails
}
I'm not sure this is possible in Rust. In the response.body I can correctly see the null-terminated string, so I know the data is there. Just unsure if there's a way to map it to a string in the Test structure. There's no reason I need to use a null-terminated string. Ultimately, I'm just trying to map a data structure of size and a string to a Rust struct of the similar types.
It looks like you are confused by two different meanings of pack:
* In Python, pack is a protocol of sorts to serialize data into an array of bytes.
* In Rust, pack is a directive added to a struct to remove padding between members and disable other weirdness.
While they can be use together to make a protocol work, that is not the case, because in your pack you have a variable-length member. And trying to serialize/deserialize a pointer value directly is a very bad idea.
Your packed flask message is basically:
4 bytes litte endian value with the number of bytes in the string.
so many bytes indicated above for the string, encoded in utf-8.
For that you do not need a packed struct. The easiest way is to just read the fields manually, one by one. Something like this (error checking omitted):
use std::convert::TryInto;
let a = i32::from_le_bytes(response[0..4].try_into().unwrap());
let b = std::str::from_utf8(&response[4 .. 4 + a as usize]).unwrap();
Don't use raw pointers, they are unsafe to use and recommended only when there are strong reasons to
get around Rust’s safety guarantees.
At minumum a struct that fits your requirement is something like:
struct Test<'a> {
value: &'a str
}
or a String owned value for avoiding lifetime dependencies.
A reference to &str comprises a len and a pointer (it is not a C-like char * pointer).
By the way, the hard part is not the parsing of the wire protocol but to manage correctly all the possible
decoding errors and avoid unexpected runtime failures in case of buggy or malicious clients.
In order not to reinvent the wheel, an example with the parse combinator nom:
use nom::{
number::complete::le_u32,
bytes::complete::take,
error::ErrorKind,
IResult
};
use easy_http_request::DefaultHttpRequest;
use std::str::from_utf8;
#[derive(Debug, Clone)]
struct Test {
value: String
}
fn decode_len_value(bytes: &[u8]) -> IResult<&[u8], Test> {
let (buffer, len) = le_u32(bytes)?;
// take len-1 bytes because null char (\0) is accounted into len
let (remaining, val) = take(len-1)(buffer)?;
match from_utf8(val) {
Ok(strval) => Ok((remaining, Test {value: strval.to_owned()})),
Err(_) => Err(nom::Err::Error((remaining, ErrorKind::Char)))
}
}
fn main() {
let response = DefaultHttpRequest::get_from_url_str("http://localhost:5000/").unwrap().send().unwrap();
let result = decode_len_value(&response.body);
println!("{:?}", result);
}

How do I encode or pack a struct into bytes without using an external library? [duplicate]

I want to send my struct via a TcpStream. I could send String or u8, but I can not send an arbitrary struct. For example:
struct MyStruct {
id: u8,
data: [u8; 1024],
}
let my_struct = MyStruct { id: 0, data: [1; 1024] };
let bytes: &[u8] = convert_struct(my_struct); // how??
tcp_stream.write(bytes);
After receiving the data, I want to convert &[u8] back to MyStruct. How can I convert between these two representations?
I know Rust has a JSON module for serializing data, but I don't want to use JSON because I want to send data as fast and small as possible, so I want to no or very small overhead.
A correctly sized struct as zero-copied bytes can be done using stdlib and a generic function.
In the example below there there is a reusable function called any_as_u8_slice instead of convert_struct, since this is a utility to wrap cast and slice creation.
Note that the question asks about converting, this example creates a read-only slice, so has the advantage of not needing to copy the memory.
Heres a working example based on the question:
unsafe fn any_as_u8_slice<T: Sized>(p: &T) -> &[u8] {
::core::slice::from_raw_parts(
(p as *const T) as *const u8,
::core::mem::size_of::<T>(),
)
}
fn main() {
struct MyStruct {
id: u8,
data: [u8; 1024],
}
let my_struct = MyStruct { id: 0, data: [1; 1024] };
let bytes: &[u8] = unsafe { any_as_u8_slice(&my_struct) };
// tcp_stream.write(bytes);
println!("{:?}", bytes);
}
Note 1) even though 3rd party crates might be better in some cases, this is such a primitive operation that its useful to know how to do in Rust.
Note 2) at time of writing (Rust 1.15), there is no support for const functions. Once there is, it will be possible to cast into a fixed sized array instead of a slice.
Note 3) the any_as_u8_slice function is marked unsafe because any padding bytes in the struct may be uninitialized memory (giving undefined behavior).
If there were a way to ensure input arguments used only structs which were #[repr(packed)], then it could be safe.
Otherwise the function is fairly safe since it prevents buffer over-run since the output is read-only, fixed number of bytes, and its lifetime is bound to the input.If you wanted a version that returned a &mut [u8], that would be quite dangerous since modifying could easily create inconsistent/corrupt data.
(Shamelessly stolen and adapted from Renato Zannon's comment on a similar question)
Perhaps a solution like bincode would suit your case? Here's a working excerpt:
Cargo.toml
[package]
name = "foo"
version = "0.1.0"
authors = ["An Devloper <an.devloper#example.com>"]
edition = "2018"
[dependencies]
bincode = "1.0"
serde = { version = "1.0", features = ["derive"] }
main.rs
use serde::{Deserialize, Serialize};
use std::fs::File;
#[derive(Serialize, Deserialize)]
struct A {
id: i8,
key: i16,
name: String,
values: Vec<String>,
}
fn main() {
let a = A {
id: 42,
key: 1337,
name: "Hello world".to_string(),
values: vec!["alpha".to_string(), "beta".to_string()],
};
// Encode to something implementing `Write`
let mut f = File::create("/tmp/output.bin").unwrap();
bincode::serialize_into(&mut f, &a).unwrap();
// Or just to a buffer
let bytes = bincode::serialize(&a).unwrap();
println!("{:?}", bytes);
}
You would then be able to send the bytes wherever you want. I assume you are already aware of the issues with naively sending bytes around (like potential endianness issues or versioning), but I'll mention them just in case ^_^.

How to convert 'struct' to '&[u8]'?

I want to send my struct via a TcpStream. I could send String or u8, but I can not send an arbitrary struct. For example:
struct MyStruct {
id: u8,
data: [u8; 1024],
}
let my_struct = MyStruct { id: 0, data: [1; 1024] };
let bytes: &[u8] = convert_struct(my_struct); // how??
tcp_stream.write(bytes);
After receiving the data, I want to convert &[u8] back to MyStruct. How can I convert between these two representations?
I know Rust has a JSON module for serializing data, but I don't want to use JSON because I want to send data as fast and small as possible, so I want to no or very small overhead.
A correctly sized struct as zero-copied bytes can be done using stdlib and a generic function.
In the example below there there is a reusable function called any_as_u8_slice instead of convert_struct, since this is a utility to wrap cast and slice creation.
Note that the question asks about converting, this example creates a read-only slice, so has the advantage of not needing to copy the memory.
Heres a working example based on the question:
unsafe fn any_as_u8_slice<T: Sized>(p: &T) -> &[u8] {
::core::slice::from_raw_parts(
(p as *const T) as *const u8,
::core::mem::size_of::<T>(),
)
}
fn main() {
struct MyStruct {
id: u8,
data: [u8; 1024],
}
let my_struct = MyStruct { id: 0, data: [1; 1024] };
let bytes: &[u8] = unsafe { any_as_u8_slice(&my_struct) };
// tcp_stream.write(bytes);
println!("{:?}", bytes);
}
Note 1) even though 3rd party crates might be better in some cases, this is such a primitive operation that its useful to know how to do in Rust.
Note 2) at time of writing (Rust 1.15), there is no support for const functions. Once there is, it will be possible to cast into a fixed sized array instead of a slice.
Note 3) the any_as_u8_slice function is marked unsafe because any padding bytes in the struct may be uninitialized memory (giving undefined behavior).
If there were a way to ensure input arguments used only structs which were #[repr(packed)], then it could be safe.
Otherwise the function is fairly safe since it prevents buffer over-run since the output is read-only, fixed number of bytes, and its lifetime is bound to the input.If you wanted a version that returned a &mut [u8], that would be quite dangerous since modifying could easily create inconsistent/corrupt data.
(Shamelessly stolen and adapted from Renato Zannon's comment on a similar question)
Perhaps a solution like bincode would suit your case? Here's a working excerpt:
Cargo.toml
[package]
name = "foo"
version = "0.1.0"
authors = ["An Devloper <an.devloper#example.com>"]
edition = "2018"
[dependencies]
bincode = "1.0"
serde = { version = "1.0", features = ["derive"] }
main.rs
use serde::{Deserialize, Serialize};
use std::fs::File;
#[derive(Serialize, Deserialize)]
struct A {
id: i8,
key: i16,
name: String,
values: Vec<String>,
}
fn main() {
let a = A {
id: 42,
key: 1337,
name: "Hello world".to_string(),
values: vec!["alpha".to_string(), "beta".to_string()],
};
// Encode to something implementing `Write`
let mut f = File::create("/tmp/output.bin").unwrap();
bincode::serialize_into(&mut f, &a).unwrap();
// Or just to a buffer
let bytes = bincode::serialize(&a).unwrap();
println!("{:?}", bytes);
}
You would then be able to send the bytes wherever you want. I assume you are already aware of the issues with naively sending bytes around (like potential endianness issues or versioning), but I'll mention them just in case ^_^.

Resources