ACCESS_VIOLATION calling Btrieve BTRCALL function from Rust

I am attempting to call Btrieve (a very old database engine) from Rust.
This is a bit long, but this is my first attempt at FFI from Rust and
I wanted to describe everything I have done.
The Btrieve engine is implemented in a DLL, w3btrv7.dll, which is a
32-bit DLL. I have made an import library for it using 32-bit MSVC tools
(it doesn't come with an official one):
lib /Def:w3btrv7.def /Out:w3btrv7.lib /Machine:x86
I then installed the 32-bit Rust toolchain stable-i686-pc-windows-msvc
and set it as my default. Bindgen barfs on the official Btrieve headers
so I had to make my own. Luckily we only need to wrap a single function,
BTRCALL.
I have this in my wrapper.h:
short int BTRCALL(
    unsigned short operation,
    void* posBlock,
    void* dataBuffer,
    unsigned short* dataLength,
    void* keyBuffer,
    unsigned char keyLength,
    char ckeynum);
I am linking as:
println!("cargo:rustc-link-lib=./src/pervasive/w3btrv7");
Which seems to work: the program runs, is a 32-bit exe, and I can
see in Process Explorer that it has loaded w3btrv7.dll.
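(For reference, the more conventional way to write this in build.rs is to split the library search path from the library name; the directory below is just where I keep the generated .lib:)
fn main() {
    // Tell rustc where to find w3btrv7.lib, then link the import library by name.
    let dir = std::env::var("CARGO_MANIFEST_DIR").unwrap();
    println!("cargo:rustc-link-search=native={}/src/pervasive", dir);
    println!("cargo:rustc-link-lib=w3btrv7");
}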
When I send the header through bindgen I get:
extern "C" {
pub fn BTRCALL(
operation: ::std::os::raw::c_ushort,
posBlock: *mut ::std::os::raw::c_void,
dataBuffer: *mut ::std::os::raw::c_void,
dataLength: *mut ::std::os::raw::c_ushort,
keyBuffer: *mut ::std::os::raw::c_void,
keyLength: ::std::os::raw::c_uchar,
ckeynum: ::std::os::raw::c_char,
) -> ::std::os::raw::c_short;
}
The types and sizes all seem to tally up correctly, and they match
a DllImport I have from a C# application which works perfectly:
[DllImport("w3btrv7.dll", CharSet = CharSet.Ansi)]
private static extern short BTRCALL(
    ushort operation,          // In C#, ushort = UInt16.
    [MarshalAs(UnmanagedType.LPArray, SizeConst = 128)] byte[] posBlock,
    [MarshalAs(UnmanagedType.LPArray)] byte[] dataBuffer,
    ref ushort dataLength,
    [MarshalAs(UnmanagedType.LPArray)] byte[] keyBuffer,
    byte keyLength,            // unsigned byte
    char keyNumber);           // 2-byte char
The keyNumber is slightly different, but I have tried both bytes and shorts in both signed and unsigned variations, and it still doesn't work.
Unfortunately when I run my program it blows up after the first call
to BTRCALL. (Well, actually it's when the function that this call is in
returns). I've extracted all the params into local variables and checked
their types and all looks correct:
let op: u16 = 0;
let mut pos_block: [u8; 128] = self.pos_block.clone();
let pos_block_ptr: *mut std::ffi::c_void = pos_block.as_mut_ptr() as *mut _;
let mut data_buffer: [u8; 32768] = self.data_buffer.clone();
let data_buffer_ptr: *mut std::ffi::c_void = data_buffer.as_mut_ptr() as *mut _;
let mut data_length: u16 = data_buffer.len() as u16;
let mut key_buffer: [u8; 256] = self.key_buffer.clone();
let key_buffer_ptr: *mut std::ffi::c_void = key_buffer.as_mut_ptr() as *mut _;
let key_length: u8 = 255; //self.key_length;
let key_number: i8 = self.key_number.try_into().unwrap();
let status: i16 = BTRCALL(
    op,
    pos_block_ptr,
    data_buffer_ptr,
    &mut data_length,
    key_buffer_ptr,
    key_length,
    key_number,
);
It crashes the program with
error: process didn't exit successfully: `target\debug\blah.exe` (exit code: 0xc0000005, STATUS_ACCESS_VIOLATION)
From what I have read, this is probably due to an improper address access.
Indeed, when I put some tracing in to check the variables, there is some very interesting behaviour: my local variables, which are passed by value, seem to be getting overwritten. The log here only dumps the first
30 bytes of the buffers because the rest is just zeros:
pos_block = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
pos_block_ptr = 0xad6524
data_buffer = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
data_buffer_ptr = 0xad65a8
data_length = 32768
key_buffer = [34, 67, 58, 92, 116, 101, 109, 112, 92, 99, 115, 115, 92, 120, 100, 98, 92, 67, 65, 83, 69, 46, 68, 66, 34, 0, 0, 0, 0, 0]
key_buffer_ptr = 0xade5b0
key_length = 255
key_number = 0
>>>>>>>>>>>>>>> AFTER THE CALL TO BTRCALL:
pos_block = [0, 0, 0, 0, 0, 0, 0, 0, 0, 76, 203, 0, 0, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0, 0, 0, 0, 0]
pos_block_ptr = 0x0
data_buffer = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
data_buffer_ptr = 0x42442e45
data_length = 0
key_buffer = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
key_buffer_ptr = 0x0
key_length = 173
key_number = 0
BTRCALL() returned B_NO_ERROR
Notice pos_block_ptr has been set to 0, among other things. In contrast, a successful execution
of the exact same call from the C# code simply writes some data into the first 18 bytes of pos_block
and doesn't change any of the other variables.
It's as if it went a bit berserk and just started overwriting memory...
At this point I don't know what to try next.

Changing the declaration from extern "C" to extern "stdcall" fixes it. BTRCALL uses the stdcall calling convention, but extern "C" makes Rust call it as cdecl; on 32-bit Windows the two conventions disagree about who pops the arguments off the stack, so every call left the caller's stack corrupted. That explains both the overwritten locals and the crash only happening when the enclosing function returned:
extern "stdcall" {
pub fn BTRCALL(
operation: ::std::os::raw::c_ushort,
posBlock: *mut ::std::os::raw::c_void,
dataBuffer: *mut ::std::os::raw::c_void,
dataLength: *mut ::std::os::raw::c_ushort,
keyBuffer: *mut ::std::os::raw::c_void,
keyLength: ::std::os::raw::c_uchar,
ckeynum: ::std::os::raw::c_char,
) -> ::std::os::raw::c_short;
}
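With the stdcall declaration in place the call site itself does not need to change. For reference, a thin wrapper around the unsafe call could look roughly like this (a sketch, not the code from the question; buffer sizes and the in/out handling of data_length are simplified):
// Sketch of a wrapper around the corrected stdcall binding.
fn btrcall(
    operation: u16,
    pos_block: &mut [u8; 128],
    data_buffer: &mut [u8],
    key_buffer: &mut [u8],
    key_number: i8,
) -> i16 {
    let mut data_length = data_buffer.len() as u16;
    unsafe {
        BTRCALL(
            operation,
            pos_block.as_mut_ptr() as *mut _,
            data_buffer.as_mut_ptr() as *mut _,
            &mut data_length,
            key_buffer.as_mut_ptr() as *mut _,
            key_buffer.len().min(255) as u8,
            key_number,
        )
    }
}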

Related

Sharing reference from a mutable method

Rust does not allow multiple mutable borrows at once. I understand that, but I cannot find any elegant way to implement a few algorithms. Below is a simplified version of one such algorithm. The Ladder struct hands out slices of an ever-increasing sequence of numbers, such as [0], [0, 1], [0, 1, 2] and so on.
struct Ladder {
    position: usize,
    data: [u8; 10],
}

impl Ladder {
    fn get_next(&mut self) -> &[u8] {
        self.position += 1;
        &(self.data[0..self.position])
    }

    fn new() -> Ladder {
        Ladder {
            position: 0,
            data: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        }
    }
}
I need to call get_next() a couple of times, collect the returned sequences and call a closure that will do something with those sequences.
fn test_ladder(consumer: impl Fn(&[&[u8]])) {
    let mut l = Ladder::new();
    let mut steps: [&[u8]; 3] = [&[]; 3];
    steps[0] = l.get_next();
    steps[1] = l.get_next();
    steps[2] = l.get_next();
    consumer(&steps);
}
fn main() {
    test_ladder(|steps| {
        for seq in steps {
            println!("{:?}", *seq);
        }
    });
}
It is a non-allocating algorithm; I cannot use std::Vec.
What is the idiomatic way to approach problems like this?
The problem here is that you can't keep references to something that you mutate, and .get_next() is allowed to mutate data. What you need to do is separate the data from the mutation. You can do that by only keeping a reference to the original data.
Creating a sequence of elements sounds a lot like an iterator, so here's an example:
struct LadderIter<'a> {
    position: usize,
    data: &'a [u8],
}

impl<'a> LadderIter<'a> {
    fn new(data: &'a [u8]) -> LadderIter<'a> {
        LadderIter { position: 0, data }
    }
}

impl<'a> Iterator for LadderIter<'a> {
    type Item = &'a [u8];

    fn next(&mut self) -> Option<Self::Item> {
        if self.position == self.data.len() {
            None
        } else {
            self.position += 1;
            Some(&self.data[0..self.position])
        }
    }
}
Which you can then use as an iterator:
for step in LadderIter::new(&[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) {
    println!("{step:?}");
}
[0]
[0, 1]
[0, 1, 2]
[0, 1, 2, 3]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4, 5]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Or in your specific use-case:
let data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
let mut ladder = LadderIter::new(&data);
let steps: [&[u8]; 3] = [
    ladder.next().unwrap(),
    ladder.next().unwrap(),
    ladder.next().unwrap(),
];
Another approach is to use interior mutability. Since you are only modifying position, you can use the zero-cost Cell:
use std::cell::Cell;

struct Ladder {
    position: Cell<usize>,
    data: [u8; 10],
}

impl Ladder {
    fn get_next(&self) -> &[u8] {
        self.position.set(self.position.get() + 1);
        &self.data[0..self.position.get()]
    }

    fn new() -> Ladder {
        Ladder {
            position: Cell::new(0),
            data: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        }
    }
}
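Because get_next now takes &self, the test_ladder function from the question should compile essentially as written, just without the mut binding (sketch):
fn test_ladder(consumer: impl Fn(&[&[u8]])) {
    let l = Ladder::new();
    let mut steps: [&[u8]; 3] = [&[]; 3];
    // Three overlapping immutable borrows of `l` are fine.
    steps[0] = l.get_next();
    steps[1] = l.get_next();
    steps[2] = l.get_next();
    consumer(&steps);
}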

How to initialize an array in a struct definition?

How can I set the array values to 0 in this struct? This is obviously wrong. How do I do it correctly?
struct Game {
    board: [[i32; 3]; 3] = [[0, 0, 0], [0, 0, 0], [0, 0, 0]];
}
In a function this would have been:
let board: [[i32; 3]; 3] = [[0, 0, 0], [0, 0, 0], [0, 0, 0]];
You cannot initialize fields in a struct definition: the definition only describes the data layout, while initialization is behaviour, which belongs in code such as a constructor.
This should work:
struct Game {
    board: [[i32; 3]; 3],
}

impl Game {
    fn new() -> Self {
        Self {
            board: [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
        }
    }
}
...
let game = Game::new();
If you want to define a default value for a struct, you can implement the Default trait for it.
In the case of a struct containing values that themselves implement Default, it is as simple as adding #[derive(Default)]:
#[derive(Default, Debug)]
struct Game {
    board: [[i32; 3]; 3],
}

fn main() {
    let game: Game = Default::default();
    println!("{:?}", game);
}
Alternatively, if your struct is more complex, you can implement Default by hand.
Playground
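For this struct, a hand-written impl is only a few lines and does exactly what the derive does (shown just for completeness):
impl Default for Game {
    fn default() -> Self {
        Game {
            board: [[0; 3]; 3],
        }
    }
}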
The advantage of using Default over writing a constructor (as in Angelicos' answer) is that:
You can use derive to implement it
Data structures which contain your struct can also use derive
You can use the ..Default::default() struct update syntax to specify some fields of a struct while defaulting the rest (see the short example after this list).
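For example, giving Game a hypothetical extra field purely to illustrate the syntax:
#[derive(Default, Debug)]
struct Game {
    board: [[i32; 3]; 3],
    // hypothetical extra field, only here to demonstrate the update syntax
    moves: u32,
}

fn main() {
    // Set `moves` explicitly and let every other field take its default value.
    let game = Game { moves: 1, ..Default::default() };
    println!("{:?}", game);
}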
See also:
The Default Trait in "Rust Design Patterns"
Derivable Traits in "The Rust Book"

Deserialize Vec<u8> with nonstandard length encoding

I am trying to use serde together with bincode to deserialize an arbitrary bitcoin network message. Given that the payload is handled ubiquitously as a byte array, how do I deserialize it when the length is unknown at compile time? By default, bincode handles Vec<u8> by assuming its length is encoded as a u64 right before the elements of the vector. However, that assumption does not hold here, because the checksum sits between the payload length and the payload itself.
I have the following working solution
Cargo.toml
[package]
name = "serde-test"
version = "0.1.0"
edition = "2018"
[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_bytes = "0.11"
bincode = "1.3.3"
main.rs
use bincode::Options;
use serde::{Deserialize, Deserializer, de::{SeqAccess, Visitor}};

#[derive(Debug)]
struct Message {
    // https://en.bitcoin.it/wiki/Protocol_documentation#Message_structure
    magic: u32,
    command: [u8; 12],
    length: u32,
    checksum: u32,
    payload: Vec<u8>,
}

struct MessageVisitor;

impl<'de> Visitor<'de> for MessageVisitor {
    type Value = Message;

    fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
        formatter.write_str("Message")
    }

    fn visit_seq<V>(self, mut seq: V) -> Result<Self::Value, V::Error>
    where
        V: SeqAccess<'de>,
    {
        let magic = seq.next_element()?.unwrap();
        let command = seq.next_element()?.unwrap();
        let length: u32 = seq.next_element()?.unwrap();
        let checksum = seq.next_element()?.unwrap();
        let payload = (0..length).map(|_| seq.next_element::<u8>().unwrap().unwrap()).collect();
        // verify payload checksum (omitted for brevity)
        Ok(Message { magic, command, length, checksum, payload })
    }
}

impl<'de> Deserialize<'de> for Message {
    fn deserialize<D>(deserializer: D) -> Result<Message, D::Error>
    where
        D: Deserializer<'de>,
    {
        deserializer.deserialize_tuple(5000, MessageVisitor) // <-- overallocation
    }
}
fn main() {
let bytes = b"\xf9\xbe\xb4\xd9version\x00\x00\x00\x00\x00e\x00\x00\x00_\x1ai\xd2r\x11\x01\x00\x01\x00\x00\x00\x00\x00\x00\x00\xbc\x8f^T\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xc6\x1bd\t \x8d\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xcb\x00q\xc0 \x8d\x12\x805\xcb\xc9yS\xf8\x0f/Satoshi:0.9.3/\xcf\x05\x05\x00\x01";
let msg: Message = bincode::DefaultOptions::new().with_fixint_encoding().deserialize(bytes).unwrap();
println!("{:?}", msg);
}
Output:
Message { magic: 3652501241, command: [118, 101, 114, 115, 105, 111, 110, 0, 0, 0, 0, 0], length: 101, checksum: 3530103391, payload: [114, 17, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 188, 143, 94, 84, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 198, 27, 100, 9, 32, 141, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 203, 0, 113, 192, 32, 141, 18, 128, 53, 203, 201, 121, 83, 248, 15, 47, 83, 97, 116, 111, 115, 104, 105, 58, 48, 46, 57, 46, 51, 47, 207, 5, 5, 0, 1] }
I dislike this solution because of how payload is handled. It requires me to allocate some "large enough" buffer to take into account the dynamic size of the payload; in the code snippet above, 5000 is sufficient. I would much rather deserialize payload as a single element and use deserializer.deserialize_tuple(5, MessageVisitor) instead.
Is there a way to handle this kind of deserialization in a succinct manner?
Similar question I could find: Can I deserialize vectors with variable length prefix with Bincode?
Your problem is that the source message is not encoded as bincode, so you are doing weird things to treat non-bincode data as if it were.
Serde is designed for creating serializers and deserializers for general-purpose formats, but your message is in a very specific format that can only be interpreted one way.
A library like nom is much more suitable for this kind of work, but it may be overkill considering how simple the format is; you can just parse it from the bytes directly:
use std::convert::TryInto;
fn main() {
let bytes = b"\xf9\xbe\xb4\xd9version\x00\x00\x00\x00\x00e\x00\x00\x00_\x1ai\xd2r\x11\x01\x00\x01\x00\x00\x00\x00\x00\x00\x00\xbc\x8f^T\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xc6\x1bd\t \x8d\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xcb\x00q\xc0 \x8d\x12\x805\xcb\xc9yS\xf8\x0f/Satoshi:0.9.3/\xcf\x05\x05\x00\x01";
    let (magic_bytes, bytes) = bytes.split_at(4);
    let magic = u32::from_le_bytes(magic_bytes.try_into().unwrap());
    let (command_bytes, bytes) = bytes.split_at(12);
    let command = command_bytes.try_into().unwrap();
    let (length_bytes, bytes) = bytes.split_at(4);
    let length = u32::from_le_bytes(length_bytes.try_into().unwrap());
    let (checksum_bytes, bytes) = bytes.split_at(4);
    let checksum = u32::from_le_bytes(checksum_bytes.try_into().unwrap());
    let payload = bytes[..length as usize].to_vec();
    let msg = Message {
        magic,
        command,
        length,
        checksum,
        payload,
    };
    println!("{:?}", msg);
}
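For completeness, if you did reach for nom after all, the same parse is only a few combinator calls. This is a sketch against nom 5+ that reuses the Message struct from the question; error handling and checksum verification are still omitted:
use std::convert::TryInto;

use nom::bytes::complete::take;
use nom::number::complete::le_u32;
use nom::IResult;

fn parse_message(input: &[u8]) -> IResult<&[u8], Message> {
    let (input, magic) = le_u32(input)?;
    let (input, command) = take(12usize)(input)?;
    let (input, length) = le_u32(input)?;
    let (input, checksum) = le_u32(input)?;
    let (input, payload) = take(length as usize)(input)?;
    Ok((
        input,
        Message {
            magic,
            command: command.try_into().unwrap(), // 12-byte slice -> [u8; 12]
            length,
            checksum,
            payload: payload.to_vec(),
        },
    ))
}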
There are hundreds of cryptocurrency projects in Rust and there are many crates already written for handling cryptocurrency data structures. These crates are battle-tested and will have much better error-handling (my example above has none). As mentioned in the comments, you can perhaps look at the bitcoin crate.

Make serde only produce hex strings for human-readable serialiser?

I'm currently using serde-hex.
use serde_hex::{SerHex, StrictPfx, CompactPfx};

#[derive(Debug, PartialEq, Eq, Serialize, Deserialize)]
struct Foo {
    #[serde(with = "SerHex::<StrictPfx>")]
    bar: [u8; 4],
    #[serde(with = "SerHex::<CompactPfx>")]
    bin: u64,
}
fn it_works() {
    let foo = Foo { bar: [0, 1, 2, 3], bin: 16 };
    let ser = serde_json::to_string(&foo).unwrap();
    let exp = r#"{"bar":"0x00010203","bin":"0x10"}"#;
    assert_eq!(ser, exp);

    // this fails
    let binser = bincode::serialize(&foo).unwrap();
    let binexp: [u8; 12] = [0, 1, 2, 3, 16, 0, 0, 0, 0, 0, 0, 0];
    assert_eq!(binser, binexp);
}
fails with:
thread 'wire::era::tests::it_works' panicked at 'assertion failed: `(left == right)`
left: `[10, 0, 0, 0, 0, 0, 0, 0, 48, 120, 48, 48, 48, 49, 48, 50, 48, 51, 4, 0, 0, 0, 0, 0, 0, 0, 48, 120, 49, 48]`,
right: `[0, 1, 2, 3, 16, 0, 0, 0, 0, 0, 0, 0]`', src/test.rs:20:9
because it has expanded values to hex strings for bincode.
I have many structs which I need to serialise with both serde_json and bincode. serde_hex does exactly what I need for JSON serialisation. When using bincode, however, serde-hex still transforms arrays into hex strings, which is not wanted.
I notice that secp256k1 uses d.is_human_readable().
How can I make serde_hex apply only to serde_json and be ignored for bincode?
The implementation of a function usable with serde's with-attribute is mostly boilerplate and looks like this.
This only differentiates between human-readable and other formats. If you need more fine-grained control, you could branch on a thread-local variable instead.
fn serialize_hex<S>(v: &u64, serializer: S) -> Result<S::Ok, S::Error>
where
    S: serde::Serializer,
{
    if serializer.is_human_readable() {
        serde_hex::SerHex::<serde_hex::StrictPfx>::serialize(v, serializer)
    } else {
        v.serialize(serializer)
    }
}
// use like
// #[serde(serialize_with = "serialize_hex")]
// bin: u64
The snippet could be improved by turning the u64 into a generic.
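For example, something along these lines should work for any field type that serde_hex can hex-encode (an untested sketch; the SerHex bound mirrors how the with-attribute resolves its serialize function):
fn serialize_hex<T, S>(v: &T, serializer: S) -> Result<S::Ok, S::Error>
where
    T: serde::Serialize + serde_hex::SerHex<serde_hex::StrictPfx>,
    S: serde::Serializer,
{
    if serializer.is_human_readable() {
        serde_hex::SerHex::<serde_hex::StrictPfx>::serialize(v, serializer)
    } else {
        v.serialize(serializer)
    }
}

// use like
// #[serde(serialize_with = "serialize_hex")]
// bar: [u8; 4]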

How to transform a u128 integer to an Uuid with nom

I have a binary data packet containing a UUID (16 bytes), a 1-byte type field and 4 bytes containing a float value.
How do I parse it with nom and get a (Uuid, u8, f32) tuple as the result?
use nom::{
    combinator::map_res, number::complete::le_f32, number::complete::le_u128,
    number::complete::le_u8, sequence::tuple, IResult,
};
use uuid;

fn detect(data: &[u8]) -> IResult<&[u8], (uuid::Uuid, u8, f32)> {
    ???
    /* my attempt so far:
    map_res(tuple((le_u128, le_u8, le_f32)), |tup| {
        Ok((uuid::Uuid::from_u128(tup.0), tup.1, tup.2))
    })(data)
    */
}
fn main() {
    let pdu = [
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 0, 0, 200, 65,
    ];
    let result = detect(&pdu);
    println!("{:?}", result);
}
[dependencies]
nom = "5"
uuid = "0.8"
In nom 5 you can just write it the classic Rust way; the example is self-describing:
use nom::combinator;
use nom::number::complete as number;
fn detect(data: &[u8]) -> nom::IResult<&[u8], (uuid::Uuid, u8, f32)> {
    let (data, uuid) = combinator::map(number::le_u128, uuid::Uuid::from_u128)(data)?;
    let (data, type_field) = number::le_u8(data)?;
    let (data, value) = number::le_f32(data)?;
    Ok((data, (uuid, type_field, value)))
}

fn main() {
    let pdu = [
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 0, 0, 200, 65,
    ];
    let result = detect(&pdu);
    println!("{:?}", result);
}
I've just fixed your attempt. You can implement, or use, a more useful error type.
use nom::{
    combinator::map_res, number::complete::le_f32, number::complete::le_u128,
    number::complete::le_u8, sequence::tuple, IResult,
};
use std::fmt::{Display, Formatter};
use uuid;

#[derive(Debug)]
struct Error;

impl Display for Error {
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        write!(f, "Error")
    }
}

impl std::error::Error for Error {}

fn detect(data: &[u8]) -> IResult<&[u8], (uuid::Uuid, u8, f32)> {
    map_res(
        tuple((le_u128, le_u8, le_f32)),
        |tup: (u128, u8, f32)| -> Result<(uuid::Uuid, u8, f32), Error> {
            Ok((uuid::Uuid::from_u128(tup.0), tup.1, tup.2))
        },
    )(data)
}

fn main() {
    let pdu = [
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 0, 0, 200, 65,
    ];
    let result = detect(&pdu);
    println!("{:?}", result);
}
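A middle ground between the two answers is to keep tuple but use map instead of map_res, which avoids needing a throw-away error type at all (a sketch for nom 5):
use nom::{
    combinator::map, number::complete::le_f32, number::complete::le_u128,
    number::complete::le_u8, sequence::tuple, IResult,
};

fn detect(data: &[u8]) -> IResult<&[u8], (uuid::Uuid, u8, f32)> {
    // map never fails, so no intermediate Result (and no Error type) is needed.
    map(tuple((le_u128, le_u8, le_f32)), |(raw, type_field, value)| {
        (uuid::Uuid::from_u128(raw), type_field, value)
    })(data)
}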
