How to manipulate buffer in Rust? - rust

A few weeks ago I got interested in Rust. So far I have only read online tutorials, and I wonder how to manipulate buffer memory in Rust. Let's say I have C code like this:
int main()
{
char buffer[] = { 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, 0xaa };
int a = *(int*)&buffer[0];
a = 0xdeadc0de;
short b = *(short*)&buffer[4];
b = 0xbadf;
*(int*)&buffer[0] = a;
*(short*)&buffer[4] = b;
//buffer memory: de c0 ad de df ba 77 88 99 aa
return 0;
}
Could anyone write this in Rust please? I think there's no casting in Rust, right?

Direct buffer manipulation through transmuted references is considered unsafe in Rust. You can of course use the unsafe keyword for writing into memory directly, but that would negate the whole safety advantage of using Rust.
You can create a u32, convert it to a [u8; 4] array, and then copy that into the buffer. You cannot, however, safely obtain a &u32 reference into a byte buffer.
fn main() {
let mut buffer = vec![0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, 0xaa];
let a: u32 = 0xdeadc0de;
let a_bytes = a.to_le_bytes();
buffer[0..4].copy_from_slice(&a_bytes);
let b: u16 = 0xbadf;
let b_bytes = b.to_le_bytes();
buffer[4..6].copy_from_slice(&b_bytes);
println!("{:x?}", buffer);
}
[de, c0, ad, de, df, ba, 77, 88, 99, aa]
Just for reference, this is what it would look like with unsafe code.
I strongly discourage this solution, though.
fn main() {
let mut buffer: Vec<u8> = vec![0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, 0xaa];
unsafe {
let ptr = buffer.as_mut_ptr();
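// A Vec<u8> is only guaranteed to be 1-byte aligned, so use unaligned writes.
let a = ptr.add(0) as *mut u32;
a.write_unaligned(0xdeadc0de);
let b = ptr.add(4) as *mut u16;
b.write_unaligned(0xbadf);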
}
println!("{:x?}", buffer);
}
[de, c0, ad, de, df, ba, 77, 88, 99, aa]
Note that the unsafe solution does not behave exactly like the safe solution. It will flip the bytes if compiled on a big-endian architecture.
This is what to_le_bytes prevents.
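Reading the values back out of the buffer can be done safely in the same spirit. The following is a minimal sketch, assuming little-endian layout as in the answer above; the helper names read_u32_le and read_u16_le are made up for illustration and are not part of the standard library.

```rust
/// Reads a little-endian u32 starting at `offset`.
/// Returns None if the four bytes are not fully inside the buffer.
fn read_u32_le(buf: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(buf.get(offset..end)?);
    Some(u32::from_le_bytes(bytes))
}

/// Reads a little-endian u16 starting at `offset`.
/// Returns None if the two bytes are not fully inside the buffer.
fn read_u16_le(buf: &[u8], offset: usize) -> Option<u16> {
    let end = offset.checked_add(2)?;
    let mut bytes = [0u8; 2];
    bytes.copy_from_slice(buf.get(offset..end)?);
    Some(u16::from_le_bytes(bytes))
}

fn main() {
    // The buffer contents after the writes shown in the answer.
    let buffer: [u8; 10] = [0xde, 0xc0, 0xad, 0xde, 0xdf, 0xba, 0x77, 0x88, 0x99, 0xaa];
    println!("{:x?}", read_u32_le(&buffer, 0));
    println!("{:x?}", read_u16_le(&buffer, 4));
    // Offset 8 leaves only two bytes, so a u32 read is rejected.
    println!("{:x?}", read_u32_le(&buffer, 8));
}
```

Returning Option instead of indexing directly means an out-of-range offset is reported to the caller rather than causing a panic.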

Related

How to implement "take while, but at most N characters" in nom?

How can I give nom's take_while an upper limit on the number of characters it should match? I want to take characters according to a certain condition, but at most N.
Since this will be a very performance critical part of the parser, I'd like to avoid using a bunch of individual take(1usize). Partly, because it feels awkward having to deal with single element slices one-by-one, but also because the compiler probably cannot know at compile time that they must have a size of 1, i.e., it will likely have to generate bound checks or size assertions, right?
Conceptually, a take_while with an upper limit would feel more appropriate. I was hoping I can do the counting with a mutable loop variable myself like so:
let mut i = 0;
let (input, _) = take_while(|c| {
i += 1;
c >= 0x80 && i < 9
})(input)?;
In fact, I could even extract the necessary information in the closure, which most likely would result in very efficient code generation. (The goal is to implement a certain kind of VarInt LSB encoding, i.e., I could update a local mutable variable x = x + (if last_byte { c } else { c & 0x7f }) << 7.)
Unfortunately take_while seems to allow only Fn and not FnMut which probably means this is impossible (why the limitation?). What else could I do to implement that nicely?
Use a Cell to make your closure have interior mutability. Then it can have mutable state, but still implement Fn:
let i = Cell::new(0);
let (input, _) = take_while(|c| {
i.set(i.get() + 1);
c > 0x80 && i.get() < 9
})(input)?;
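To see why the Cell trick satisfies an Fn bound, here is a small sketch that does not use nom at all. The function count_leading is a hypothetical stand-in for a combinator that only accepts Fn closures, and bounded_prefix_len is likewise an illustrative name.

```rust
use std::cell::Cell;

// Stand-in for a combinator that accepts only `Fn`, not `FnMut`.
fn count_leading<F: Fn(u8) -> bool>(input: &[u8], pred: F) -> usize {
    input.iter().take_while(|&&b| pred(b)).count()
}

// Counts leading bytes >= 0x80, but never more than `max` of them.
fn bounded_prefix_len(input: &[u8], max: usize) -> usize {
    // The Cell is mutated through a shared reference, so the closure
    // below captures `calls` by `&` and still implements `Fn`.
    let calls = Cell::new(0usize);
    count_leading(input, |b| {
        calls.set(calls.get() + 1);
        b >= 0x80 && calls.get() <= max
    })
}

fn main() {
    let input = [0x81u8, 0x82, 0x83, 0x84, 0x85, 0x86];
    println!("{}", bounded_prefix_len(&input, 4));
}
```

The counter tracks how often the predicate was called, so the bound only behaves as expected if the caller invokes the predicate exactly once per element, in order.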
There is a nom-function take_while_m_n:
const N: usize = ...
fn take_native(input: &[u8]) -> IResult<&[u8], &[u8]> {
take_while_m_n(0, N, |c| c > 0x80)(input)
}
However, in a quick benchmark it appears to be markedly slower than the Cell answer (or the benchmark is being optimized in a misleading way, since the Cell version takes only 1 ns/iter compared to 13 ns/iter for take_while_m_n).
#![feature(test)]
extern crate test;
use std::cell::Cell;
use nom::{
bytes::complete::{take_while, take_while_m_n},
IResult,
};
fn take_cell(input: &[u8]) -> IResult<&[u8], &[u8]> {
let i = Cell::new(0);
let (input, output) = take_while(|c| {
i.set(i.get() + 1);
c > 0x80 && i.get() < 5
})(input)?;
Ok((input, output))
}
fn take_native(input: &[u8]) -> IResult<&[u8], &[u8]> {
take_while_m_n(0, 4, |c| c > 0x80)(input)
}
#[cfg(test)]
mod tests {
use super::*;
const INPUT: &[u8] = &[
0x81, 0x82, 0x83, 0x84, 0x81, 0x82, 0x83, 0x84, 0x81, 0x82, 0x83, 0x84, 0x81, 0x82, 0x83,
0x84, 0x81, 0x82, 0x83, 0x84, 0x81, 0x82, 0x83, 0x84, 0x81, 0x82, 0x83, 0x84,
];
#[bench]
fn bench_cell(b: &mut test::Bencher) {
assert_eq!(take_cell(INPUT).unwrap().1, &[0x81, 0x82, 0x83, 0x84]);
b.iter(|| take_cell(INPUT).unwrap());
}
#[bench]
fn bench_native(b: &mut test::Bencher) {
assert_eq!(take_native(INPUT).unwrap().1, &[0x81, 0x82, 0x83, 0x84]);
b.iter(|| take_native(INPUT).unwrap());
}
}
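For comparison, the "take while, but at most N" behaviour itself can be sketched on plain byte slices with only the standard library. This is not nom's implementation; take_while_at_most is a hypothetical helper that returns (rest, taken) in the same order as nom's IResult payload.

```rust
// Splits `input` into (rest, taken), where `taken` is the longest prefix
// whose bytes all satisfy `pred`, capped at `max` bytes.
fn take_while_at_most<F>(input: &[u8], max: usize, pred: F) -> (&[u8], &[u8])
where
    F: Fn(u8) -> bool,
{
    // `take(max)` enforces the upper bound before the predicate is consulted.
    let n = input.iter().take(max).take_while(|&&b| pred(b)).count();
    let (taken, rest) = input.split_at(n);
    (rest, taken)
}

fn main() {
    let input = [0x81u8, 0x82, 0x83, 0x84, 0x81, 0x82];
    let (rest, taken) = take_while_at_most(&input, 4, |c| c > 0x80);
    println!("taken = {:x?}, rest = {:x?}", taken, rest);
}
```

Because the bound is applied by the iterator rather than by state inside the closure, the predicate stays a pure function of the byte.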

How to execute raw instructions from a memory buffer in Rust?

I'm attempting to make a buffer of memory executable, then execute it in Rust. I've gotten all the way until I need to cast the raw executable bytes as code/instructions. You can see a working example in C below.
Extra details:
Rust 1.34
Linux
CC 8.2.1
unsigned char code[] = {
0x55, // push %rbp
0x48, 0x89, 0xe5, // mov %rsp,%rbp
0xb8, 0x37, 0x00, 0x00, 0x00, // mov $0x37,%eax
0xc9, // leaveq
0xc3 // retq
};
void reflect(const unsigned char *code) {
void *buf;
/* copy code to executable buffer */
buf = mmap(0, sizeof(code), PROT_READ|PROT_WRITE|PROT_EXEC,MAP_PRIVATE|MAP_ANON,-1,0);
memcpy(buf, code, sizeof(code));
((void (*) (void))buf)();
}
extern crate mmap;
use mmap::{MapOption, MemoryMap};
unsafe fn reflect(instructions: &[u8]) {
let map = MemoryMap::new(
instructions.len(),
&[
MapOption::MapAddr(0 as *mut u8),
MapOption::MapOffset(0),
MapOption::MapFd(-1),
MapOption::MapReadable,
MapOption::MapWritable,
MapOption::MapExecutable,
MapOption::MapNonStandardFlags(libc::MAP_ANON),
MapOption::MapNonStandardFlags(libc::MAP_PRIVATE),
],
)
.unwrap();
std::ptr::copy(instructions.as_ptr(), map.data(), instructions.len());
// How to cast into extern "C" fn() ?
}
Use mem::transmute to cast a raw pointer to a function pointer type.
use std::mem;
let func: unsafe extern "C" fn() = mem::transmute(map.data());
func();
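The transmute step can be tried in isolation without mapping any executable memory. The sketch below round-trips an ordinary function through a raw pointer; return_0x37 and call_as_function are illustrative names, and the return value 0x37 merely mirrors the constant in the machine code from the question.

```rust
use std::mem;

// An ordinary function with the C ABI, standing in for the mapped code.
extern "C" fn return_0x37() -> i32 {
    0x37
}

/// Calls the code at `ptr` as an `extern "C" fn() -> i32`.
///
/// # Safety
/// `ptr` must point to executable code with exactly this signature and ABI.
unsafe fn call_as_function(ptr: *const ()) -> i32 {
    // A raw pointer and a function pointer have the same size,
    // so transmute accepts the conversion.
    let func: extern "C" fn() -> i32 = unsafe { mem::transmute(ptr) };
    func()
}

fn main() {
    let f: extern "C" fn() -> i32 = return_0x37;
    let ptr = f as *const ();
    let value = unsafe { call_as_function(ptr) };
    println!("{:#x}", value);
}
```

Declaring the return type in the function pointer matters: the machine code in the question leaves its result in eax, so a signature of fn() -> i32 would be needed to observe it.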

How to use ioctl + nix macros to get a variable size buffer

This is related to How to use nix's ioctl? but it is not the same question.
I want to retrieve a variable size buffer. There is another ioctl that tells me that I need to read X bytes. The C header tells me the following too:
#define HID_MAX_DESCRIPTOR_SIZE 4096
#define HIDIOCGRDESC _IOR('H', 0x02, struct hidraw_report_descriptor)
struct hidraw_report_descriptor {
__u32 size;
__u8 value[HID_MAX_DESCRIPTOR_SIZE];
};
I define the macro in the following way:
ioctl_read_buf!(hid_read_descr, b'H', 0x02, u8);
And later call:
let mut desc_raw = [0u8; 4 + 4096];
let err = unsafe { hid_read_descr(file.as_raw_fd(), &mut desc_raw); };
When doing this, desc_raw is full of zeros. I would have expected the first 4 bytes to contain size based on the struct definition.
The alternative does not seem to work either:
ioctl_read!(hid_read_descr2, b'H', 0x02, [u8; 4+4096]);
// ...
let mut desc_raw = [0xFFu8; 4 + 4096];
let err = unsafe { hid_read_descr2(file.as_raw_fd(), &mut desc_raw); };
In both cases, I have tried initializing desc_raw with 0xFF and after the call, it seems untouched.
Am I using the ioctl_read_buf macro incorrectly?
Now that Digikata has thoughtfully provided enough code to drive the program...
Am I using the ioctl_read_buf macro incorrectly?
I'd say that using it at all is incorrect here. You don't want to read an array of data, you want to read a single instance of a specific type. That's what ioctl_read! is for.
We define a repr(C) struct that mimics the C definition. This ensures that important details like alignment, padding, field ordering, etc., all match one-to-one with the code we are calling.
We can then construct an uninitialized instance of this struct and pass it to the newly-defined function.
use libc; // 0.2.66
use nix::ioctl_read; // 0.16.1
use std::{
fs::OpenOptions,
mem::MaybeUninit,
os::unix::{fs::OpenOptionsExt, io::AsRawFd},
};
const HID_MAX_DESCRIPTOR_SIZE: usize = 4096;
#[repr(C)]
pub struct hidraw_report_descriptor {
size: u32,
value: [u8; HID_MAX_DESCRIPTOR_SIZE],
}
ioctl_read!(hid_read_sz, b'H', 0x01, libc::c_int);
ioctl_read!(hid_read_descr, b'H', 0x02, hidraw_report_descriptor);
fn main() -> Result<(), Box<dyn std::error::Error>> {
let file = OpenOptions::new()
.read(true)
.write(true)
.custom_flags(libc::O_NONBLOCK)
.open("/dev/hidraw0")?;
unsafe {
let fd = file.as_raw_fd();
let mut size = 0;
hid_read_sz(fd, &mut size)?;
println!("{}", size);
let mut desc_raw = MaybeUninit::<hidraw_report_descriptor>::uninit();
(*desc_raw.as_mut_ptr()).size = size as u32;
hid_read_descr(file.as_raw_fd(), desc_raw.as_mut_ptr())?;
let desc_raw = desc_raw.assume_init();
let data = &desc_raw.value[..desc_raw.size as usize];
println!("{:02x?}", data);
}
Ok(())
}
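The layout claim can be checked without touching a device. This is a minimal sketch using only the standard library; the struct is renamed HidrawReportDescriptor and the methods with_size and data are hypothetical helpers, not part of nix or libc. Unlike the slicing in the code above, data clamps the reported size so an out-of-range value cannot cause a panic.

```rust
use std::mem::{align_of, size_of};

const HID_MAX_DESCRIPTOR_SIZE: usize = 4096;

// Mirrors the C struct: a 32-bit size followed by a fixed-size byte array.
#[repr(C)]
pub struct HidrawReportDescriptor {
    size: u32,
    value: [u8; HID_MAX_DESCRIPTOR_SIZE],
}

impl HidrawReportDescriptor {
    // Builds a zero-filled descriptor with the size field pre-set,
    // as the ioctl expects before the call.
    fn with_size(size: u32) -> Self {
        Self { size, value: [0; HID_MAX_DESCRIPTOR_SIZE] }
    }

    // Returns the valid part of the buffer, clamped to the fixed maximum.
    fn data(&self) -> &[u8] {
        let len = (self.size as usize).min(HID_MAX_DESCRIPTOR_SIZE);
        &self.value[..len]
    }
}

fn main() {
    println!("size  = {}", size_of::<HidrawReportDescriptor>());
    println!("align = {}", align_of::<HidrawReportDescriptor>());
    let desc = HidrawReportDescriptor::with_size(52);
    println!("valid bytes = {}", desc.data().len());
}
```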
I think you've got a couple of issues here. Some on the Rust side, and some with using the HIDIOCGRDESC ioctl incorrectly. If you look in a Linux kernel distribution at the hidraw.txt and hid-example.c code, the use of the struct is as follows:
struct hidraw_report_descriptor rpt_desc;
memset(&rpt_desc, 0x0, sizeof(rpt_desc));
/* Get Report Descriptor */
rpt_desc.size = desc_size;
res = ioctl(fd, HIDIOCGRDESC, &rpt_desc);
desc_size comes from a previous HIDIOCGRDESCSIZE ioctl call. Unless I fill in the correct size parameter, the ioctl returns an error (ENOTTY or EINVAL).
There are also issues with passing the O_NONBLOCK flag to open a HID device without using libc::open. I ended up with this:
#[macro_use]
extern crate nix;
extern crate libc;
ioctl_read!(hid_read_sz, b'H', 0x01, i32);
ioctl_read_buf!(hid_read_descr, b'H', 0x02, u8);
fn main() {
// see /usr/include/linux/hidraw.h
// and hid-example.c
use std::ffi::CString;
let fname = CString::new("/dev/hidraw0").unwrap();
let fd = unsafe { libc::open(fname.as_ptr(), libc::O_NONBLOCK | libc::O_RDWR) };
let mut sz = 0i32;
let err = unsafe { hid_read_sz(fd, &mut sz) };
println!("{:?} size is {:?}", err, sz);
let mut desc_raw = [0x0u8; 4 + 4096];
// sz on my system ended up as 52 - this handjams in the value
// w/ a little endian swizzle into the C struct .size field, but
// really we should properly define the struct
desc_raw[0] = sz as u8;
let err = unsafe { hid_read_descr(fd, &mut desc_raw) };
println!("{:?}", err);
for (i, &b) in desc_raw.iter().enumerate() {
if b != 0 {
println!("{:4} {:?}", i, b);
}
}
}
In the end, you shouldn't size the struct to a variable size; the ioctl header indicates a fixed maximum. The variability is left to the ioctl itself to deal with; it just needs the expected size, obtained from another ioctl call.

Why does a generic function replicating C's fread for unsigned integers always return zero?

I am trying to read in binary 16-bit machine instructions from a 16-bit architecture (the exact nature of that is irrelevant here), and print them back out as hexadecimal values. In C, I found this simple by using the fread function to read 16 bits into a uint16_t.
I figured that I would try to replicate fread in Rust. It seems to be reasonably trivial if I can know ahead-of-time the exact size of the variable that is being read into, and I had that working specifically for 16 bits.
I decided that I wanted to try to make the fread function generic over the various built-in unsigned integer types. For that I came up with the below function, using some traits from the Num crate:
fn fread<T>(
buffer: &mut T,
element_count: usize,
stream: &mut BufReader<File>,
) -> Result<usize, std::io::Error>
where
T: num::PrimInt + num::Unsigned,
{
let type_size = std::mem::size_of::<T>();
let mut buf = Vec::with_capacity(element_count * type_size);
let buf_slice = buf.as_mut_slice();
let bytes_read = match stream.read_exact(buf_slice) {
Ok(()) => element_count * type_size,
Err(ref e) if e.kind() == std::io::ErrorKind::UnexpectedEof => 0,
Err(e) => panic!("{}", e),
};
*buffer = buf_slice
.iter()
.enumerate()
.map(|(i, &b)| {
let mut holder2: T = num::zero();
holder2 = holder2 | T::from(b).expect("Casting from u8 to T failed");
holder2 << ((type_size - i) * 8)
})
.fold(num::zero(), |acc, h| acc | h);
Ok(bytes_read)
}
The issue is that when I call it in the main function, I seem to always get 0x00 back out, but the number of bytes read that is returned by the function is always 2, so that the program enters an infinite loop:
extern crate num;
use std::fs::File;
use std::io::BufReader;
use std::io::prelude::Read;
fn main() -> Result<(), std::io::Error> {
let cmd_line_args = std::env::args().collect::<Vec<_>>();
let f = File::open(&cmd_line_args[1])?;
let mut reader = BufReader::new(f);
let mut instructions: Vec<u16> = Vec::new();
let mut next_instruction: u16 = 0;
fread(&mut next_instruction, 1, &mut reader)?;
let base_address = next_instruction;
while fread(&mut next_instruction, 1, &mut reader)? > 0 {
instructions.push(next_instruction);
}
println!("{:#04x}", base_address);
for i in instructions {
println!("0x{:04x}", i);
}
Ok(())
}
It appears to me that I'm somehow never reading anything from the file, so the function always just returns the number of bytes it was supposed to read. I'm clearly not using something correctly here, but I'm honestly unsure what I'm doing wrong.
This is compiled on Rust 1.26 stable for Windows if that matters.
What am I doing wrong, and what should I do differently to replicate fread? I realise that this is probably a case of the XY problem (in that there's almost certainly a better Rust way to repeatedly read some bytes from a file and pack them into one unsigned integer), but I'm really curious as to what I'm doing wrong here.
Your problem is that this line:
let mut buf = Vec::with_capacity(element_count * type_size);
creates a zero-length vector, even though it allocates memory for element_count * type_size bytes. Therefore you are asking stream.read_exact to read zero bytes. One way to fix this is to replace the above line with:
let mut buf = vec![0; element_count * type_size];
Side note: when the read succeeds, bytes_read receives the number of bytes you expected to read, not the number of bytes you actually read. You should probably use std::mem::size_of_val(buf_slice) to get the true byte count.
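The difference between capacity and length is easy to observe with an in-memory reader. In this sketch, bytes_consumed is a hypothetical helper that reports how many bytes read_exact actually pulled from the source.

```rust
use std::io::Read;

// Fills `buf` from `data` with read_exact and returns how many bytes
// were consumed from the source.
fn bytes_consumed(mut buf: Vec<u8>, data: &[u8]) -> usize {
    let mut src = data;
    // read_exact fills exactly buf.len() bytes; capacity plays no role.
    src.read_exact(&mut buf).expect("source too short");
    data.len() - src.len()
}

fn main() {
    let data = [0x12u8, 0x34];
    // Capacity 2 but length 0: read_exact succeeds without reading anything.
    println!("{}", bytes_consumed(Vec::with_capacity(2), &data));
    // Length 2: both bytes are read.
    println!("{}", bytes_consumed(vec![0; 2], &data));
}
```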
in that there's almost certainly a better Rust way to repeatedly read some bytes from a file and pack them into one unsigned integer
Yes, use the byteorder crate. This requires no unneeded heap allocation (the Vec in the original code):
extern crate byteorder;
use byteorder::{LittleEndian, ReadBytesExt};
use std::{
fs::File, io::{self, BufReader, Read},
};
fn read_instructions_to_end<R>(mut rdr: R) -> io::Result<Vec<u16>>
where
R: Read,
{
let mut instructions = Vec::new();
loop {
match rdr.read_u16::<LittleEndian>() {
Ok(instruction) => instructions.push(instruction),
Err(e) => {
return if e.kind() == std::io::ErrorKind::UnexpectedEof {
Ok(instructions)
} else {
Err(e)
}
}
}
}
}
fn main() -> Result<(), std::io::Error> {
let name = std::env::args().skip(1).next().expect("no file name");
let f = File::open(name)?;
let mut f = BufReader::new(f);
let base_address = f.read_u16::<LittleEndian>()?;
let instructions = read_instructions_to_end(f)?;
println!("{:#04x}", base_address);
for i in &instructions {
println!("0x{:04x}", i);
}
Ok(())
}
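If adding a dependency is not an option, roughly the same loop can be sketched with the standard library alone, using a stack array and u16::from_le_bytes. The names read_u16_le and read_all_u16_le are illustrative, and little-endian input is assumed, as in the byteorder version.

```rust
use std::io::{self, Read};

// Reads one little-endian u16 from the reader.
fn read_u16_le<R: Read>(rdr: &mut R) -> io::Result<u16> {
    let mut bytes = [0u8; 2];
    rdr.read_exact(&mut bytes)?;
    Ok(u16::from_le_bytes(bytes))
}

// Reads u16 values until the reader runs out of data.
// An UnexpectedEof (including a trailing odd byte) ends the loop.
fn read_all_u16_le<R: Read>(mut rdr: R) -> io::Result<Vec<u16>> {
    let mut values = Vec::new();
    loop {
        match read_u16_le(&mut rdr) {
            Ok(v) => values.push(v),
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => return Ok(values),
            Err(e) => return Err(e),
        }
    }
}

fn main() -> io::Result<()> {
    // A byte slice implements Read, which makes it handy for experiments.
    let data: &[u8] = &[0x00, 0x30, 0x34, 0x12];
    for value in read_all_u16_le(data)? {
        println!("0x{:04x}", value);
    }
    Ok(())
}
```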

How to read a struct from a file in Rust?

Is there a way I can read a structure directly from a file in Rust? My code is:
use std::fs::File;
struct Configuration {
item1: u8,
item2: u16,
item3: i32,
item4: [char; 8],
}
fn main() {
let file = File::open("config_file").unwrap();
let mut config: Configuration;
// How to read struct from file?
}
How would I read my configuration directly into config from the file? Is this even possible?
Here you go:
use std::io::Read;
use std::mem;
use std::slice;
#[repr(C, packed)]
#[derive(Debug, Copy, Clone)]
struct Configuration {
item1: u8,
item2: u16,
item3: i32,
item4: [char; 8],
}
const CONFIG_DATA: &[u8] = &[
0xfd, // u8
0xb4, 0x50, // u16
0x45, 0xcd, 0x3c, 0x15, // i32
0x71, 0x3c, 0x87, 0xff, // char
0xe8, 0x5d, 0x20, 0xe7, // char
0x5f, 0x38, 0x05, 0x4a, // char
0xc4, 0x58, 0x8f, 0xdc, // char
0x67, 0x1d, 0xb4, 0x64, // char
0xf2, 0xc5, 0x2c, 0x15, // char
0xd8, 0x9a, 0xae, 0x23, // char
0x7d, 0xce, 0x4b, 0xeb, // char
];
fn main() {
let mut buffer = CONFIG_DATA;
let mut config: Configuration = unsafe { mem::zeroed() };
let config_size = mem::size_of::<Configuration>();
unsafe {
let config_slice = slice::from_raw_parts_mut(&mut config as *mut _ as *mut u8, config_size);
// `read_exact()` comes from `Read` impl for `&[u8]`
buffer.read_exact(config_slice).unwrap();
}
println!("Read structure: {:#?}", config);
}
(Updated for Rust 1.38.)
You need to be careful, however, as unsafe code is, well, unsafe. After the slice::from_raw_parts_mut() invocation, there exist two mutable handles to the same data at the same time, which is a violation of Rust aliasing rules. Therefore you would want to keep the mutable slice created out of a structure for the shortest possible time. I also assume that you know about endianness issues - the code above is by no means portable, and will return different results if compiled and run on different kinds of machines (ARM vs x86, for example).
If you can choose the format and you want a compact binary one, consider using bincode. Otherwise, if you need e.g. to parse some pre-defined binary structure, byteorder crate is the way to go.
As Vladimir Matveev mentions, using the byteorder crate is often the best solution. This way, you account for endianness issues, don't have to deal with any unsafe code, or worry about alignment or padding:
use byteorder::{LittleEndian, ReadBytesExt}; // 1.2.7
use std::{
fs::File,
io::{self, Read},
};
struct Configuration {
item1: u8,
item2: u16,
item3: i32,
}
impl Configuration {
fn from_reader(mut rdr: impl Read) -> io::Result<Self> {
let item1 = rdr.read_u8()?;
let item2 = rdr.read_u16::<LittleEndian>()?;
let item3 = rdr.read_i32::<LittleEndian>()?;
Ok(Configuration {
item1,
item2,
item3,
})
}
}
fn main() {
let file = File::open("/dev/random").unwrap();
let config = Configuration::from_reader(file);
}
I've ignored the [char; 8] for a few reasons:
Rust's char is a 32-bit type and it's unclear if your file has actual Unicode code points or C-style 8-bit values.
You can't easily parse an array with byteorder, you have to parse N values and then build the array yourself.
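Both interpretations of the [char; 8] field can be sketched with the standard library. The function names below are made up for illustration, and the 32-bit variant assumes the code points are stored in little-endian order, which the question does not state.

```rust
use std::io::{self, Read};

// Interprets the field as eight C-style 8-bit characters.
fn read_c_chars<R: Read>(mut rdr: R) -> io::Result<[char; 8]> {
    let mut raw = [0u8; 8];
    rdr.read_exact(&mut raw)?;
    let mut out = ['\0'; 8];
    for (dst, &byte) in out.iter_mut().zip(raw.iter()) {
        // Every u8 maps to a valid char (U+0000 to U+00FF).
        *dst = char::from(byte);
    }
    Ok(out)
}

// Interprets the field as eight 32-bit Unicode code points (little-endian).
fn read_code_points<R: Read>(mut rdr: R) -> io::Result<[char; 8]> {
    let mut out = ['\0'; 8];
    for dst in out.iter_mut() {
        let mut raw = [0u8; 4];
        rdr.read_exact(&mut raw)?;
        // Not every u32 is a valid char, so invalid values become an error.
        *dst = char::from_u32(u32::from_le_bytes(raw))
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "invalid code point"))?;
    }
    Ok(out)
}

fn main() -> io::Result<()> {
    let chars = read_c_chars(&b"config01"[..])?;
    println!("{:?}", chars);
    // 0xffffffff is not a Unicode scalar value, so this is rejected.
    println!("{}", read_code_points(&[0xffu8; 32][..]).is_err());
    Ok(())
}
```

Validating each code point is the step that reading raw bytes straight into a [char; 8] skips, which is why that approach can produce invalid char values.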
The following code does not take into account any endianness or padding issues and is intended to be used with POD types. struct Configuration should be safe in this case.
Here is a function that can read a struct (of a POD type) from a file:
use std::io::{self, Read};
use std::mem::{self, MaybeUninit};
use std::slice;
fn read_struct<T, R: Read>(mut read: R) -> io::Result<T> {
let num_bytes = mem::size_of::<T>();
// mem::uninitialized() is deprecated; start from zeroed memory instead,
// so the byte slice below never exposes uninitialized data.
let mut s = MaybeUninit::<T>::zeroed();
unsafe {
let buffer = slice::from_raw_parts_mut(s.as_mut_ptr() as *mut u8, num_bytes);
read.read_exact(buffer)?;
// Only sound if every bit pattern is a valid T (i.e. T is a POD type).
Ok(s.assume_init())
}
}
// use
// read_struct::<Configuration>(reader)
If you want to read a sequence of structs from a file, you can execute read_struct multiple times or read all the file at once:
use std::fs::{self, File};
use std::io::BufReader;
use std::path::Path;
fn read_structs<T, P: AsRef<Path>>(path: P) -> io::Result<Vec<T>> {
let path = path.as_ref();
let struct_size = ::std::mem::size_of::<T>();
let num_bytes = fs::metadata(path)?.len() as usize;
let num_structs = num_bytes / struct_size;
let mut reader = BufReader::new(File::open(path)?);
let mut r = Vec::<T>::with_capacity(num_structs);
unsafe {
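// Only read whole structs; num_bytes may exceed the allocated capacity
// when the file length is not a multiple of struct_size.
let buffer = slice::from_raw_parts_mut(r.as_mut_ptr() as *mut u8, num_structs * struct_size);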
reader.read_exact(buffer)?;
r.set_len(num_structs);
}
Ok(r)
}
// use
// read_structs::<StructName, _>("path/to/file")
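The padding caveat mentioned above can be made concrete by comparing the size of the struct with and without packed. This is a small sketch with hypothetical type names Padded and Packed; the sizes in the comments assume a typical target where u16 is 2-byte aligned and i32 is 4-byte aligned.

```rust
use std::mem::size_of;

// Natural C layout: one padding byte is inserted after item1.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    item1: u8,
    item2: u16,
    item3: i32,
}

// Packed layout: fields are stored back to back with no padding.
#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    item1: u8,
    item2: u16,
    item3: i32,
}

fn main() {
    // Expected on such targets: 8 bytes for Padded, 7 for Packed.
    println!("padded: {}", size_of::<Padded>());
    println!("packed: {}", size_of::<Packed>());
}
```

A file written as 7 tightly packed bytes per record would therefore be misread by read_struct::<Padded>, which consumes 8 bytes per record.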

Resources