Deserialize Vec<u8> with nonstandard length encoding - rust

I am trying to use serde together with bincode to deserialize an arbitrary Bitcoin network message. Given that the payload is handled ubiquitously as a byte array, how do I deserialize it when the length is unknown at compile time? By default, bincode handles Vec<u8> by assuming its length is encoded as a u64 directly before the elements of the vector. That assumption does not hold here, because the checksum sits between the length field and the payload bytes.
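For illustration, this is what bincode 1.x's default serialize does with a Vec<u8>: a fixed 8-byte little-endian length followed by the raw bytes (a quick check, assuming bincode 1.x):

fn main() {
    // bincode's legacy `serialize` prefixes the elements with the
    // vector's length as a little-endian u64.
    let encoded = bincode::serialize(&vec![1u8, 2, 3]).unwrap();
    assert_eq!(encoded, [3, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3]);
}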
I have the following working solution:
Cargo.toml
[package]
name = "serde-test"
version = "0.1.0"
edition = "2018"
[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_bytes = "0.11"
bincode = "1.3.3"
main.rs
use bincode::Options;
use serde::{Deserialize, Deserializer, de::{SeqAccess, Visitor}};
#[derive(Debug)]
struct Message {
    // https://en.bitcoin.it/wiki/Protocol_documentation#Message_structure
    magic: u32,
    command: [u8; 12],
    length: u32,
    checksum: u32,
    payload: Vec<u8>,
}

struct MessageVisitor;

impl<'de> Visitor<'de> for MessageVisitor {
    type Value = Message;

    fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
        formatter.write_str("Message")
    }

    fn visit_seq<V>(self, mut seq: V) -> Result<Self::Value, V::Error>
    where
        V: SeqAccess<'de>,
    {
        let magic = seq.next_element()?.unwrap();
        let command = seq.next_element()?.unwrap();
        let length: u32 = seq.next_element()?.unwrap();
        let checksum = seq.next_element()?.unwrap();
        let payload = (0..length)
            .map(|_| seq.next_element::<u8>().unwrap().unwrap())
            .collect();
        // verify payload checksum (omitted for brevity)
        Ok(Message { magic, command, length, checksum, payload })
    }
}

impl<'de> Deserialize<'de> for Message {
    fn deserialize<D>(deserializer: D) -> Result<Message, D::Error>
    where
        D: Deserializer<'de>,
    {
        deserializer.deserialize_tuple(5000, MessageVisitor) // <-- overallocation
    }
}
fn main() {
    let bytes = b"\xf9\xbe\xb4\xd9version\x00\x00\x00\x00\x00e\x00\x00\x00_\x1ai\xd2r\x11\x01\x00\x01\x00\x00\x00\x00\x00\x00\x00\xbc\x8f^T\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xc6\x1bd\t \x8d\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xcb\x00q\xc0 \x8d\x12\x805\xcb\xc9yS\xf8\x0f/Satoshi:0.9.3/\xcf\x05\x05\x00\x01";
    let msg: Message = bincode::DefaultOptions::new()
        .with_fixint_encoding()
        .deserialize(bytes)
        .unwrap();
    println!("{:?}", msg);
}
Output:
Message { magic: 3652501241, command: [118, 101, 114, 115, 105, 111, 110, 0, 0, 0, 0, 0], length: 101, checksum: 3530103391, payload: [114, 17, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 188, 143, 94, 84, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 198, 27, 100, 9, 32, 141, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 203, 0, 113, 192, 32, 141, 18, 128, 53, 203, 201, 121, 83, 248, 15, 47, 83, 97, 116, 111, 115, 104, 105, 58, 48, 46, 57, 46, 51, 47, 207, 5, 5, 0, 1] }
I dislike this solution because of how payload is handled. It requires me to allocate some "large enough" buffer to account for the dynamic size of the payload; in the code snippet above, 5000 is sufficient. I would much rather deserialize payload as a single element and use deserializer.deserialize_tuple(5, MessageVisitor) instead.
Is there a way to handle this kind of deserialization in a succinct manner?
Similar question I could find: Can I deserialize vectors with variable length prefix with Bincode?

Your problem is that the source message is not encoded as bincode, so you are doing weird things to treat non-bincode data as if it were.
Serde is designed for creating serializers and deserializers for general-purpose formats, but your message is in a very specific format that can only be interpreted one way.
A library like nom is much more suitable for this kind of work, but it may be overkill considering how simple the format is; you can just parse it from the bytes directly:
use std::convert::TryInto;

fn main() {
    let bytes = b"\xf9\xbe\xb4\xd9version\x00\x00\x00\x00\x00e\x00\x00\x00_\x1ai\xd2r\x11\x01\x00\x01\x00\x00\x00\x00\x00\x00\x00\xbc\x8f^T\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xc6\x1bd\t \x8d\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xcb\x00q\xc0 \x8d\x12\x805\xcb\xc9yS\xf8\x0f/Satoshi:0.9.3/\xcf\x05\x05\x00\x01";

    let (magic_bytes, bytes) = bytes.split_at(4);
    let magic = u32::from_le_bytes(magic_bytes.try_into().unwrap());
    let (command_bytes, bytes) = bytes.split_at(12);
    let command = command_bytes.try_into().unwrap();
    let (length_bytes, bytes) = bytes.split_at(4);
    let length = u32::from_le_bytes(length_bytes.try_into().unwrap());
    let (checksum_bytes, bytes) = bytes.split_at(4);
    let checksum = u32::from_le_bytes(checksum_bytes.try_into().unwrap());
    let payload = bytes[..length as usize].to_vec();

    let msg = Message {
        magic,
        command,
        length,
        checksum,
        payload,
    };
    println!("{:?}", msg);
}
There are hundreds of cryptocurrency projects in Rust, and many crates already written for handling cryptocurrency data structures. These crates are battle-tested and will have much better error handling (my example above has none). As mentioned in the comments, you can perhaps look at the bitcoin crate.
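If you do stay with hand-rolled parsing, making it fallible is straightforward. Here is a minimal sketch (the ParseError type and parse_message helper are made up for illustration) that returns an error instead of panicking on truncated input:

use std::convert::TryInto;

// Hypothetical error type for illustration.
#[derive(Debug)]
struct ParseError;

fn parse_message(bytes: &[u8]) -> Result<Message, ParseError> {
    // The fixed-size header is 4 + 12 + 4 + 4 = 24 bytes.
    if bytes.len() < 24 {
        return Err(ParseError);
    }
    let (magic_bytes, bytes) = bytes.split_at(4);
    let magic = u32::from_le_bytes(magic_bytes.try_into().unwrap());
    let (command_bytes, bytes) = bytes.split_at(12);
    let command = command_bytes.try_into().unwrap();
    let (length_bytes, bytes) = bytes.split_at(4);
    let length = u32::from_le_bytes(length_bytes.try_into().unwrap());
    let (checksum_bytes, bytes) = bytes.split_at(4);
    let checksum = u32::from_le_bytes(checksum_bytes.try_into().unwrap());
    // Reject messages whose declared length exceeds the remaining input.
    let payload = bytes.get(..length as usize).ok_or(ParseError)?.to_vec();
    Ok(Message { magic, command, length, checksum, payload })
}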

Related

How to stream a vector of bytes to BufWriter?

I'm trying to stream bytes to a TCP server by using io::copy(&mut reader, &mut writer), but it gives me this error: the trait "std::io::Read" is not implemented for "Vec<{integer}>". Here I have a vector of bytes, which would be the same as opening a file and converting it to bytes. I want to write the bytes to the BufWriter. What am I doing wrong?
use std::io;
use std::net::TcpStream;
use std::io::BufWriter;
pub fn connect() {
    if let Ok(stream) = TcpStream::connect("localhost:8080") {
        println!("Connection established!");
        let mut reader = vec![
            137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82, 0, 0, 0, 70, 0, 0, 0, 70,
        ];
        let mut writer = BufWriter::new(&stream);
        io::copy(&mut reader, &mut writer).expect("Failed to write to stream");
    } else {
        println!("Couldn't connect to the server")
    }
}
error[E0277]: the trait bound `Vec<{integer}>: std::io::Read` is not satisfied
--> src/lib.rs:12:31
|
12 | io::copy(&mut reader, &mut writer).expect("Failed to write to stream");
| -------- ^^^^^^^^^^^ the trait `std::io::Read` is not implemented for `Vec<{integer}>`
| |
| required by a bound introduced by this call
|
note: required by a bound in `std::io::copy`
The compiler has a little trouble here: Vec doesn't implement Read, but &[u8] does. You just have to get a slice from the vec before creating a mutable reference:
copy(&mut reader.as_slice(), &mut writer).expect("Failed to write to stream");
See also:
What is the difference between storing a Vec vs a Slice?
What are the differences between Rust's `String` and `str`?
Using .as_slice() like so works for me:
pub fn connect() {
    if let Ok(stream) = TcpStream::connect("localhost:8080") {
        println!("Connection established!");
        let reader = vec![
            137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82, 0, 0, 0, 70, 0, 0, 0, 70,
        ];
        let mut writer = BufWriter::new(&stream);
        io::copy(&mut reader.as_slice(), &mut writer).expect("Failed to write to stream");
    } else {
        println!("Couldn't connect to the server")
    }
}
That's because std::io::Read is implemented for &[u8], the slice type you get from .as_slice().
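If you'd rather keep a reader that owns its bytes, std::io::Cursor<Vec<u8>> also implements Read, so it can be handed to io::copy directly; a minimal sketch:

use std::io::{self, Cursor};

fn main() -> io::Result<()> {
    let bytes = vec![137u8, 80, 78, 71, 13, 10, 26, 10];
    // Cursor wraps the Vec and tracks a read position, which gives us `Read`.
    let mut reader = Cursor::new(bytes);
    let mut out: Vec<u8> = Vec::new();
    io::copy(&mut reader, &mut out)?;
    assert_eq!(out, [137, 80, 78, 71, 13, 10, 26, 10]);
    Ok(())
}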

How can I parse an itm.txt file?

I am new to embedded and Rust programming. I am working on a sensor and I am able to get readings from it.
The problem:
The data is in the itm.txt file, which includes headers and a variable-length payload. When I read the sensor, the file contains the raw output shown below.
Can someone please help me parse this data? I am finding no way in Rust to do it.
This is the code I am using to get the file content.
use std::io;
use std::fs::File;
use std::io::{Read, BufReader};

fn decode<R: io::Read>(mut r: R) -> io::Result<Vec<u8>> {
    let mut output = vec![];
    loop {
        let mut len = 0u8;
        r.read_exact(std::slice::from_mut(&mut len))?;
        let len = 1 << (len - 1);
        let mut buf = vec![0; len];
        let res = r.read_exact(&mut buf);
        if buf == b"\0" {
            break;
        }
        output.extend(buf);
    }
    Ok(output)
}

fn main() {
    //let data = "{ x: 579, y: -197 , z: -485 }\0";
    let mut file = File::open("/tmp/itm.txt").expect("No such file is present at this location");
    let mut buf_reader = Vec::new();
    file.read_to_end(&mut buf_reader).expect("error");
    let content = std::str::from_utf8(&buf_reader).unwrap();
    println!("raw str: {:?}", content);
    println!("raw hex: {:x?}", content.as_bytes());

    let decoded = decode(content.as_bytes()).unwrap();
    let s = std::str::from_utf8(&decoded).expect("Failed");
    println!("decoded str: {:?}", s);
}
The output and error I am getting now:
raw str: "\u{2}Un\u{3}scal\u{3}edMe\u{3}asur\u{3}emen\u{1}t\u{1} \u{2}{ \u{1}x\u{2}: \u{2}31\u{1}3\u{1},\u{1} \n\n"
raw hex: [2, 55, 6e, 3, 73, 63, 61, 6c, 3, 65, 64, 4d, 65, 3, 61, 73, 75, 72, 3, 65, 6d, 65, 6e, 1, 74, 1, 20, 2, 7b, 20, 1, 78, 2, 3a, 20, 2, 33, 31, 1, 33, 1, 2c, 1, 20, a, a]
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { kind: UnexpectedEof, message: "failed to fill whole buffer" }', src/main.rs:51:46
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
This is the link to itm thing.
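For reference, a minimal sketch of the same decoder that treats end-of-input as end-of-stream instead of propagating UnexpectedEof (assuming the one-byte header scheme from the snippet above, where a header byte h is followed by 1 << (h - 1) payload bytes; the decode_tolerant name is made up):

use std::io::{self, Read};

fn decode_tolerant<R: Read>(mut r: R) -> io::Result<Vec<u8>> {
    let mut output = Vec::new();
    loop {
        let mut len = 0u8;
        // Running out of input between frames ends the stream cleanly.
        match r.read_exact(std::slice::from_mut(&mut len)) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        if len == 0 {
            // A zero header would underflow the shift below; skip it.
            continue;
        }
        // Header scheme from the snippet above: h => 1 << (h - 1) payload bytes.
        let len = 1usize << (len - 1);
        let mut buf = vec![0u8; len];
        // A truncated final frame (like the trailing newlines in the dump
        // above, which the scheme reads as a 512-byte frame) also ends it.
        match r.read_exact(&mut buf) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        if buf == b"\0" {
            // A single NUL payload marked the end of a reading above.
            break;
        }
        output.extend_from_slice(&buf);
    }
    Ok(output)
}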

Make serde only produce hex strings for human-readable serialiser?

I'm currently using serde-hex.
use serde_hex::{SerHex, StrictPfx, CompactPfx};

#[derive(Debug, PartialEq, Eq, Serialize, Deserialize)]
struct Foo {
    #[serde(with = "SerHex::<StrictPfx>")]
    bar: [u8; 4],
    #[serde(with = "SerHex::<CompactPfx>")]
    bin: u64,
}

fn it_works() {
    let foo = Foo { bar: [0, 1, 2, 3], bin: 16 };
    let ser = serde_json::to_string(&foo).unwrap();
    let exp = r#"{"bar":"0x00010203","bin":"0x10"}"#;
    assert_eq!(ser, exp);

    // this fails
    let binser = bincode::serialize(&foo).unwrap();
    let binexp: [u8; 12] = [0, 1, 2, 3, 16, 0, 0, 0, 0, 0, 0, 0];
    assert_eq!(binser, binexp);
}
fails with:
thread 'wire::era::tests::it_works' panicked at 'assertion failed: `(left == right)`
left: `[10, 0, 0, 0, 0, 0, 0, 0, 48, 120, 48, 48, 48, 49, 48, 50, 48, 51, 4, 0, 0, 0, 0, 0, 0, 0, 48, 120, 49, 48]`,
right: `[0, 1, 2, 3, 16, 0, 0, 0, 0, 0, 0, 0]`', src/test.rs:20:9
because serde-hex has expanded the values to hex strings for the bincode output as well.
I have many structs which I need to serialise with both serde_json and bincode. serde_hex does exactly what I need for JSON serialisation. When using bincode serde-hex still transforms arrays into hex strings, which is not wanted.
I notice that secp256k1 uses d.is_human_readable().
How can I make serde_hex apply only to serde_json and be ignored for bincode?
The implementation of a function usable with serde's with-attribute is mostly boilerplate and looks like this.
This only differentiates between human-readable and other formats. If you need more fine-grained control, you could branch on a thread-local variable instead.
fn serialize_hex<S>(v: &u64, serializer: S) -> Result<S::Ok, S::Error>
where
    S: serde::Serializer,
{
    if serializer.is_human_readable() {
        serde_hex::SerHex::<serde_hex::StrictPfx>::serialize(v, serializer)
    } else {
        v.serialize(serializer)
    }
}

// use like
// #[serde(serialize_with = "serialize_hex")]
// bin: u64
The snippet could be improved by turning the u64 into a generic.
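A sketch of that generic version, with the matching deserializer added (the helper names are my own; it assumes the type implements both the serde traits and serde_hex's SerHex, as u64 and [u8; N] do):

use serde::{Deserialize, Deserializer, Serialize, Serializer};
use serde_hex::{SerHex, StrictPfx};

// Hex strings for human-readable formats, native encoding otherwise.
fn serialize_hex<T, S>(v: &T, serializer: S) -> Result<S::Ok, S::Error>
where
    T: Serialize + SerHex<StrictPfx>,
    S: Serializer,
{
    if serializer.is_human_readable() {
        SerHex::<StrictPfx>::serialize(v, serializer)
    } else {
        v.serialize(serializer)
    }
}

fn deserialize_hex<'de, T, D>(deserializer: D) -> Result<T, D::Error>
where
    T: Deserialize<'de> + SerHex<StrictPfx>,
    D: Deserializer<'de>,
{
    if deserializer.is_human_readable() {
        // Fully qualified calls, since both traits define `deserialize`.
        <T as SerHex<StrictPfx>>::deserialize(deserializer)
    } else {
        <T as Deserialize>::deserialize(deserializer)
    }
}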

ACCESS_VIOLATION calling Btrieve BTRCALL function from Rust

I am attempting to call Btrieve (a very old database engine) from Rust.
This is a bit long, but this is my first attempt at FFI from Rust and
I wanted to describe everything I have done.
The Btrieve engine is implemented in a DLL, w3btrv7.dll, which is a
32-bit DLL. I have made an import library for it using 32-bit MSVC tools
(it doesn't come with an official one):
lib /Def:w3btrv7.def /Out:w3btrv7.lib /Machine:x86
I then installed the 32-bit Rust toolchain stable-i686-pc-windows-msvc
and set it as my default. Bindgen barfs on the official Btrieve headers
so I had to make my own. Luckily we only need to wrap a single function,
BTRCALL.
I have this in my wrapper.h:
short int BTRCALL(
    unsigned short operation,
    void* posBlock,
    void* dataBuffer,
    unsigned short* dataLength,
    void* keyBuffer,
    unsigned char keyLength,
    char ckeynum);
I am linking as:
println!("cargo:rustc-link-lib=./src/pervasive/w3btrv7");
Which seems to work: the program runs, is a 32-bit exe, and I can
see in Process Explorer that it has loaded w3btrv7.dll.
When I send the header through bindgen I get:
extern "C" {
pub fn BTRCALL(
operation: ::std::os::raw::c_ushort,
posBlock: *mut ::std::os::raw::c_void,
dataBuffer: *mut ::std::os::raw::c_void,
dataLength: *mut ::std::os::raw::c_ushort,
keyBuffer: *mut ::std::os::raw::c_void,
keyLength: ::std::os::raw::c_uchar,
ckeynum: ::std::os::raw::c_char,
) -> ::std::os::raw::c_short;
}
The types and sizes all seem to tally up correctly, and they match
a DllImport I have from a C# application which works perfectly:
[DllImport("w3btrv7.dll", CharSet = CharSet.Ansi)]
private static extern short BTRCALL(
    ushort operation,   // In C#, ushort = UInt16.
    [MarshalAs(UnmanagedType.LPArray, SizeConst = 128)] byte[] posBlock,
    [MarshalAs(UnmanagedType.LPArray)] byte[] dataBuffer,
    ref ushort dataLength,
    [MarshalAs(UnmanagedType.LPArray)] byte[] keyBuffer,
    byte keyLength,     // unsigned byte
    char keyNumber);    // 2 byte char
The keyNumber is slightly different, but I have tried both bytes and shorts in both signed and unsigned variations, and it still doesn't work.
Unfortunately when I run my program it blows up after the first call
to BTRCALL. (Well, actually it's when the function that this call is in
returns). I've extracted all the params into local variables and checked
their types and all looks correct:
let op: u16 = 0;
let mut pos_block: [u8; 128] = self.pos_block.clone();
let pos_block_ptr: *mut std::ffi::c_void = pos_block.as_mut_ptr() as *mut _;
let mut data_buffer: [u8; 32768] = self.data_buffer.clone();
let data_buffer_ptr: *mut std::ffi::c_void = data_buffer.as_mut_ptr() as *mut _;
let mut data_length: u16 = data_buffer.len() as u16;
let mut key_buffer: [u8; 256] = self.key_buffer.clone();
let key_buffer_ptr: *mut std::ffi::c_void = key_buffer.as_mut_ptr() as *mut _;
let key_length: u8 = 255; //self.key_length;
let key_number: i8 = self.key_number.try_into().unwrap();

// Calling the extern function requires an unsafe block.
let status: i16 = unsafe {
    BTRCALL(
        op,
        pos_block_ptr,
        data_buffer_ptr,
        &mut data_length,
        key_buffer_ptr,
        key_length,
        key_number,
    )
};
It crashes the program with
error: process didn't exit successfully: `target\debug\blah.exe` (exit code: 0xc0000005, STATUS_ACCESS_VIOLATION)
From what I have read, this is probably due to an improper address access.
Indeed, when I put some tracing in to check the variables, there is some very interesting behaviour: my local variables, which are passed by value, seem to be getting overwritten. The log here just dumps the first 30 bytes of the buffers because the rest is just zeros:
pos_block = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
pos_block_ptr = 0xad6524
data_buffer = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
data_buffer_ptr = 0xad65a8
data_length = 32768
key_buffer = [34, 67, 58, 92, 116, 101, 109, 112, 92, 99, 115, 115, 92, 120, 100, 98, 92, 67, 65, 83, 69, 46, 68, 66, 34, 0, 0, 0, 0, 0]
key_buffer_ptr = 0xade5b0
key_length = 255
key_number = 0
>>>>>>>>>>>>>>> AFTER THE CALL TO BTRCALL:
pos_block = [0, 0, 0, 0, 0, 0, 0, 0, 0, 76, 203, 0, 0, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0, 0, 0, 0, 0]
pos_block_ptr = 0x0
data_buffer = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
data_buffer_ptr = 0x42442e45
data_length = 0
key_buffer = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
key_buffer_ptr = 0x0
key_length = 173
key_number = 0
BTRCALL() returned B_NO_ERROR
Notice pos_block_ptr has been set to 0, among other things. In contrast, a successful execution
of the exact same call from the C# code simply writes some data into the first 18 bytes of pos_block
and doesn't change any of the other variables.
It's as if it went a bit berserk and just started overwriting memory...
At this point I don't know what to try next.
Changing the declaration from extern "C" to extern "stdcall" works. w3btrv7.dll evidently uses the stdcall calling convention, as most 32-bit Windows DLLs do; declaring the function extern "C" (cdecl) makes caller and callee disagree about who cleans up the stack, which corrupts the caller's stack frame and explains the clobbered locals:
extern "stdcall" {
pub fn BTRCALL(
operation: ::std::os::raw::c_ushort,
posBlock: *mut ::std::os::raw::c_void,
dataBuffer: *mut ::std::os::raw::c_void,
dataLength: *mut ::std::os::raw::c_ushort,
keyBuffer: *mut ::std::os::raw::c_void,
keyLength: ::std::os::raw::c_uchar,
ckeynum: ::std::os::raw::c_char,
) -> ::std::os::raw::c_short;
}
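To keep the unsafety in one place, the call site can then be wrapped in a small helper; a minimal sketch (the btrcall name and signature are my own, assuming the caller sizes the buffers):

// Hypothetical safe-ish wrapper around the stdcall binding above.
// Returns the Btrieve status code; data_length is updated in place.
fn btrcall(
    operation: u16,
    pos_block: &mut [u8; 128],
    data_buffer: &mut [u8],
    data_length: &mut u16,
    key_buffer: &mut [u8],
    key_number: i8,
) -> i16 {
    unsafe {
        BTRCALL(
            operation,
            pos_block.as_mut_ptr() as *mut _,
            data_buffer.as_mut_ptr() as *mut _,
            data_length,
            key_buffer.as_mut_ptr() as *mut _,
            key_buffer.len().min(255) as u8,
            key_number,
        )
    }
}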

Why do I get incorrect values when implementing HMAC-SHA256?

I'm trying to make a function in Rust that will return a HMAC-SHA256 digest. I've been working from the description at Wikipedia and RFC 2104.
I've been struggling with returning the correct HMAC. I'm using ring for the SHA256 digests but no matter what I try, I can't seem to get the right result. I suspect it might have something to do with .as_ref().to_vec() conversions. Even if that's true, I don't know how to continue from that. Not everything from RFC 2104 is implemented in the following code, but it highlights my issue.
extern crate ring;

use ring::{digest, test};

pub fn hmac(k: Vec<u8>, mut m: Vec<u8>) -> Vec<u8> {
    // Initialize ipad and opad as byte vectors with SHA256 blocksize
    let ipad = vec![0x5C; 64];
    let opad = vec![0x36; 64];

    // iround and oround are used to separate the two steps with XORing
    let mut iround = vec![];
    let mut oround = vec![];
    for count in 0..k.len() {
        iround.push(k[count] ^ ipad[count]);
        oround.push(k[count] ^ opad[count]);
    }
    iround.append(&mut m); // m is emptied here
    iround = (digest::digest(&digest::SHA256, &iround).as_ref()).to_vec();
    oround.append(&mut iround); // iround is emptied here
    oround = (digest::digest(&digest::SHA256, &oround).as_ref()).to_vec();
    let hashed_mac = oround.to_vec();
    return hashed_mac;
}

#[test]
fn test_hmac_digest() {
    let k = vec![0x61; 64];
    let m = vec![0x62; 64];
    let actual = hmac(k, m);
    // Expected value taken from: https://www.freeformatter.com/hmac-generator.html#ad-output
    let expected = test::from_hex("f6cbb37b326d36f2f27d294ac3bb46a6aac29c1c9936b985576041bfb338ae70").unwrap();
    assert_eq!(actual, expected);
}
These are the digests:
Actual = [139, 141, 144, 52, 11, 3, 48, 112, 117, 7, 56, 151, 163, 65, 152, 195, 163, 164, 26, 250, 178, 100, 187, 230, 89, 61, 191, 164, 146, 228, 180, 62]
Expected = [246, 203, 179, 123, 50, 109, 54, 242, 242, 125, 41, 74, 195, 187, 70, 166, 170, 194, 156, 28, 153, 54, 185, 133, 87, 96, 65, 191, 179, 56, 174, 112]
As mentioned in a comment, you have swapped the bytes for the inner and outer padding. Refer back to the Wikipedia page:
o_key_pad = key xor [0x5c * blockSize] //Outer padded key
i_key_pad = key xor [0x36 * blockSize] //Inner padded key
Here's what my take on the function would look like. I believe it has fewer allocations:
extern crate ring;

use ring::{digest, test};

const BLOCK_SIZE: usize = 64;

pub fn hmac(k: &[u8], m: &[u8]) -> Vec<u8> {
    assert_eq!(k.len(), BLOCK_SIZE);

    let mut i_key_pad: Vec<_> = k.iter().map(|&k| k ^ 0x36).collect();
    let mut o_key_pad: Vec<_> = k.iter().map(|&k| k ^ 0x5C).collect();

    i_key_pad.extend_from_slice(m);

    let hash = |v| digest::digest(&digest::SHA256, v);

    let a = hash(&i_key_pad);
    o_key_pad.extend_from_slice(a.as_ref());
    hash(&o_key_pad).as_ref().to_vec()
}

#[test]
fn test_hmac_digest() {
    let k = [0x61; BLOCK_SIZE];
    let m = [0x62; BLOCK_SIZE];
    let actual = hmac(&k, &m);
    // Expected value taken from: https://www.freeformatter.com/hmac-generator.html#ad-output
    let expected = test::from_hex("f6cbb37b326d36f2f27d294ac3bb46a6aac29c1c9936b985576041bfb338ae70").unwrap();
    assert_eq!(actual, expected);
}
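As an aside, RFC 2104 also defines how to handle keys that are not exactly one block long: keys longer than the block size are hashed first, and shorter keys are zero-padded. The assert_eq! above sidesteps that. A sketch of the missing step, reusing BLOCK_SIZE and ring's digest from the answer (the normalize_key name is my own):

// Normalize an arbitrary-length key to one block, per RFC 2104.
// Longer keys are hashed; the untouched tail of the array is the zero padding.
fn normalize_key(k: &[u8]) -> [u8; BLOCK_SIZE] {
    let mut block = [0u8; BLOCK_SIZE];
    if k.len() > BLOCK_SIZE {
        let d = digest::digest(&digest::SHA256, k);
        block[..d.as_ref().len()].copy_from_slice(d.as_ref());
    } else {
        block[..k.len()].copy_from_slice(k);
    }
    block
}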
