How to add/subtract an offset to/from NonNull<Opaque>?

I provide two functions for managing memory:
unsafe extern "system" fn alloc<A: Alloc>(
    size: usize,
    alignment: usize,
) -> *mut c_void { ... }
unsafe extern "system" fn free<A: Alloc>(
    memory: *mut c_void
) { ... }
Both functions internally use the allocator-api.
These signatures cannot be changed. The problem is that free does not ask for size and alignment, which are required for Alloc::dealloc. To get around this, alloc allocates some extra space for one Layout; free can then read this Layout to get the missing data.
Recently, the allocator-api changed and instead of *mut u8 it now uses NonNull<Opaque>. This is where my problem occurs.
core::alloc::Opaque:
An opaque, unsized type. Used for pointers to allocated memory. [...] Such pointers are similar to C’s void* type.
Opaque is not Sized, so pointer arithmetic such as NonNull::as_ptr().add() and NonNull::as_ptr().sub() is not available.
Previously, I used something like this (for simplicity, I assume Alloc's functions to be static):
#![feature(allocator_api)]
#![no_std]
extern crate libc;
use core::alloc::{Alloc, Layout};
use libc::c_void;
unsafe extern "system" fn alloc<A: Alloc>(
    size: usize,
    alignment: usize,
) -> *mut c_void
{
    let requested_layout =
        Layout::from_size_align(size, alignment).unwrap();
    let (layout, padding) = Layout::new::<Layout>()
        .extend_packed(requested_layout)
        .unwrap();
    let ptr = A::alloc(layout).unwrap();
    (ptr as *mut Layout).write(layout);
    ptr.add(padding)
}
The last line is no longer possible with NonNull<Opaque>. How can I get around this?

I'd probably write it like this, using NonNull::as_ptr to get a *mut Opaque and then cast that to different concrete types:
#![feature(allocator_api)]
#![no_std]
extern crate libc;
use core::alloc::{Alloc, Layout};
use libc::c_void;
unsafe fn alloc<A: Alloc>(allocator: &mut A, size: usize, alignment: usize) -> *mut c_void {
    let requested_layout = Layout::from_size_align(size, alignment).expect("Invalid layout");
    let (layout, _padding) = Layout::new::<Layout>()
        .extend_packed(requested_layout)
        .expect("Unable to create layout");
    let ptr = allocator.alloc(layout).expect("Unable to allocate");
    // Get a pointer to our layout storage
    let raw = ptr.as_ptr() as *mut Layout;
    // Save it
    raw.write(layout);
    // Skip over it
    raw.offset(1) as *mut _
}
unsafe extern "system" fn alloc<A: Alloc>(
This makes no sense to me. The various FFI ABIs ("C", "system", etc.) have no way of specifying Rust generic types. It seems deeply incorrect for this function to be marked extern.
Layout::new::<Layout>().extend_packed(requested_layout)
This seems likely to be very broken. As the documentation for Layout::extend_packed states, emphasis mine:
the alignment of next is irrelevant, and is not incorporated at all into the resulting layout.
Your returned pointer doesn't seem to honor the alignment request.
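For what it's worth, here is a minimal sketch of one way to keep the hidden header and still honor the requested alignment. It makes a few assumptions: it uses the stable free functions in std::alloc instead of the nightly Alloc trait, it uses Layout::extend (which, unlike extend_packed, pads so that the second layout is properly aligned), and the names alloc_with_header and free_with_header are made up for this example. The requested layout is stored directly in front of the returned pointer, so the free side can find it without knowing the alignment in advance.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::mem::size_of;
use std::ptr;

/// Hypothetical helper: allocates `size` bytes aligned to `alignment` and
/// records the requested layout in a header just before the returned pointer.
/// Returns null if the layout is invalid or the allocation fails.
pub unsafe fn alloc_with_header(size: usize, alignment: usize) -> *mut u8 {
    let requested = match Layout::from_size_align(size, alignment) {
        Ok(layout) => layout,
        Err(_) => return ptr::null_mut(),
    };
    // `extend` places `requested` after the header at an offset that is a
    // multiple of `alignment`; `offset` is always >= size_of::<Layout>().
    let (full, offset) = match Layout::new::<Layout>().extend(requested) {
        Ok(pair) => pair,
        Err(_) => return ptr::null_mut(),
    };
    unsafe {
        let base = alloc(full);
        if base.is_null() {
            return ptr::null_mut();
        }
        let user = base.add(offset);
        // Store the header in the bytes immediately preceding `user`.
        user.sub(size_of::<Layout>())
            .cast::<Layout>()
            .write_unaligned(requested);
        user
    }
}

/// Hypothetical helper: frees a pointer obtained from `alloc_with_header`.
pub unsafe fn free_with_header(user: *mut u8) {
    if user.is_null() {
        return;
    }
    unsafe {
        let requested = user
            .sub(size_of::<Layout>())
            .cast::<Layout>()
            .read_unaligned();
        // Recompute the combined layout and offset exactly as in the allocation.
        let (full, offset) = Layout::new::<Layout>()
            .extend(requested)
            .expect("layout was valid when allocated");
        dealloc(user.sub(offset), full);
    }
}
```

The cost of this design is that large alignments waste up to one alignment unit in front of the data, since the header has to fit into the padding.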

Related

How to set a free strategy?

I've encountered high memory usage (looks like a memory leak) in production environment (container in k8s), and want to check if it's because of the "MADV_FREE" behaviour.
Is there a way to change to use MADV_DONTNEED instead of MADV_FREE in rust?
Rust allows overriding the default allocator with the #[global_allocator] attribute:
use std::alloc::{GlobalAlloc, Layout, System};
struct MyAllocator;
unsafe impl GlobalAlloc for MyAllocator {
    // Forward every request to the system allocator unchanged.
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        System.alloc(layout)
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}
#[global_allocator]
static GLOBAL: MyAllocator = MyAllocator;
You could use this to change the behavior of the deallocation to your needs.
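As a rough illustration of what such a wrapper could do, here is a sketch that counts the bytes currently handed out by the allocator. It does not switch MADV_FREE to MADV_DONTNEED by itself; the idea is only that if this counter stays flat while the container's reported memory keeps growing, a leak in the Rust code becomes less likely. CountingAlloc and live_bytes are hypothetical names invented for this example.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper around the system allocator that tracks live bytes.
pub struct CountingAlloc {
    live: AtomicUsize,
}

impl CountingAlloc {
    pub const fn new() -> Self {
        CountingAlloc {
            live: AtomicUsize::new(0),
        }
    }

    /// Bytes allocated through this allocator and not yet freed.
    pub fn live_bytes(&self) -> usize {
        self.live.load(Ordering::Relaxed)
    }
}

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            self.live.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        self.live.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

// To install it process-wide, one would add:
// #[global_allocator]
// static GLOBAL: CountingAlloc = CountingAlloc::new();
```

The default realloc and alloc_zeroed implementations of GlobalAlloc are built on alloc and dealloc, so the counter stays consistent without overriding them.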
Or possibly use an existing crate that implements an allocator that logs allocations and deallocations, such as tracing-allocator or logging-allocator.
use std::fs::File;
#[global_allocator]
static GLOBAL: tracing_allocator::Allocator = tracing_allocator::Allocator{};
fn main() {
    let f = File::create("trace.txt").unwrap();
    tracing_allocator::Allocator::initialize(&f);
    tracing_allocator::Allocator::activate();
}
(I have no experience with these crates, so I have no idea what they use for allocations.)

How can I align memory on the heap (in a box) in a convenient way?

I'd like to align my heap memory to a specific alignment boundary. This boundary is known at compilation time. Anyhow, the Box abstraction doesn't allow me to specify a certain alignment. Writing my own Box-abstraction doesn't feel right too, because all of the Rust ecosystem uses Box already. What is a convenient way to achieve alignment for heap allocations?
PS: In my specific case I need page alignments.
If nightly is acceptable, the Allocator API provides a fairly convenient way to do this:
#![feature(allocator_api)]
use std::alloc::*;
use std::ptr::NonNull;
struct AlignedAlloc<const N: usize>;
unsafe impl<const N: usize> Allocator for AlignedAlloc<N> {
    fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        Global.allocate(layout.align_to(N).unwrap())
    }
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        Global.deallocate(ptr, layout.align_to(N).unwrap())
    }
}
fn main() {
    let val = Box::new_in(3, AlignedAlloc::<4096>);
    let ptr: *const u8 = &*val;
    println!(
        "val:{}, alignment:{}",
        val,
        1 << (ptr as usize).trailing_zeros()
    );
}
If you wanted you could also add support for using other allocators or choosing the value dynamically.
Edit: Come to think of it, this approach can also be used on stable via the GlobalAlloc trait and the #[global_allocator] attribute. Setting an allocator that way forces every allocation in the program to that same alignment, though, so it is fine for a relatively small alignment but probably not a good idea for page-boundary alignment.
Edit 2: Switched from the System allocator to the Global allocator, because that is bundled with the allocator_api feature and is a more sensible and predictable default.
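For the stable GlobalAlloc variant mentioned in the first edit, a minimal sketch could look like the following. MinAlign is a hypothetical name, and forwarding to System is an assumption made for the example. It raises the alignment of every request to at least N, which is exactly why it gets wasteful for large N.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::ptr;

/// Hypothetical allocator that aligns every allocation to at least `N` bytes.
/// `N` must be a power of two, otherwise every allocation fails.
pub struct MinAlign<const N: usize>;

unsafe impl<const N: usize> GlobalAlloc for MinAlign<N> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Report failure with a null pointer instead of panicking inside the allocator.
        match layout.align_to(N) {
            Ok(raised) => unsafe { System.alloc(raised) },
            Err(_) => ptr::null_mut(),
        }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // Must use the same raised layout that `alloc` passed to the system allocator.
        if let Ok(raised) = layout.align_to(N) {
            unsafe { System.dealloc(ptr, raised) };
        }
    }
}

// To install it process-wide, one would add:
// #[global_allocator]
// static GLOBAL: MinAlign<64> = MinAlign;
```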
Depending on what data you want to align, you can use #[repr(align(N))] to specify the alignment of a type. You can either use that directly on your data or use a wrapper struct.
For example, the following will always be aligned on 16 bytes (whether on the heap, on the stack or inside another struct):
#[repr(C, align(16))]
struct AlignedBuffer([u8; 32]);
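Applied to the page-alignment case from the question, the same attribute could be used roughly like this. It is a sketch that assumes a 4096-byte page; Page, boxed_page and is_page_aligned are names invented here. Because the alignment is part of the type, a plain Box allocates and frees it with the right layout on stable Rust.

```rust
use std::mem::align_of;

/// Hypothetical page-sized, page-aligned buffer (assumes 4096-byte pages).
#[repr(C, align(4096))]
pub struct Page(pub [u8; 4096]);

/// Heap-allocates one zeroed page; `Box` takes the alignment from the type.
pub fn boxed_page() -> Box<Page> {
    Box::new(Page([0u8; 4096]))
}

/// Checks that a page reference sits on a multiple of the type's alignment.
pub fn is_page_aligned(page: &Page) -> bool {
    (page as *const Page as usize) % align_of::<Page>() == 0
}
```

One caveat: Box::new may construct the value on the stack before moving it to the heap, so this form is best kept to values of modest size.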
Posting an alternative answer just in case folks are curious. I'm not a Rust expert, but this seems to work:
https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=4c2d9c936e2755d1fc835d1e4663ec8b
#![feature(allocator_api)]
use std::alloc::{alloc_zeroed, handle_alloc_error, Layout};
const PAGE_SIZE: usize = 8192;
const NUM_PAGES: usize = 1000;
#[derive(Debug)]
struct Pool {
    frames: Box<[[u8; PAGE_SIZE]; NUM_PAGES]>,
}
impl Pool {
    pub fn new() -> Self {
        Self {
            frames: unsafe {
                let layout = Layout::from_size_align(NUM_PAGES * PAGE_SIZE, PAGE_SIZE).unwrap();
                let ptr = alloc_zeroed(layout);
                if ptr.is_null() {
                    handle_alloc_error(layout);
                }
                // Caveat: Box frees memory with the layout of its pointee, and this
                // array type has alignment 1, not PAGE_SIZE, so the layouts differ.
                let ptr = ptr as *mut [[u8; PAGE_SIZE]; NUM_PAGES];
                Box::from_raw(ptr)
            },
        }
    }
}
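A variation on this pool that may be safer, sketched under a few assumptions: the page alignment is moved into a Page type via #[repr(align)], so the layout used for the allocation is the same one Box uses when it frees the memory. Page is a name introduced here, NUM_PAGES is reduced only to keep the example light, and the literal in the repr attribute has to be kept in sync with PAGE_SIZE by hand.

```rust
use std::alloc::{alloc_zeroed, handle_alloc_error, Layout};

const PAGE_SIZE: usize = 8192;
const NUM_PAGES: usize = 16; // smaller than in the answer above, for illustration

/// Hypothetical page type; the literal must match PAGE_SIZE.
#[repr(C, align(8192))]
struct Page([u8; PAGE_SIZE]);

struct Pool {
    frames: Box<[Page; NUM_PAGES]>,
}

impl Pool {
    fn new() -> Self {
        // The layout is derived from the type, so it matches what Box will free.
        let layout = Layout::new::<[Page; NUM_PAGES]>();
        // SAFETY: the layout has non-zero size, all-zero bytes are a valid
        // `[Page; NUM_PAGES]`, and the pointer comes from the global allocator.
        let frames = unsafe {
            let ptr = alloc_zeroed(layout);
            if ptr.is_null() {
                handle_alloc_error(layout);
            }
            Box::from_raw(ptr.cast::<[Page; NUM_PAGES]>())
        };
        Pool { frames }
    }
}
```

Allocating through alloc_zeroed also avoids building the whole array on the stack first.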

How do I allocate a Vec<u8> that is aligned to the size of the cache line?

I need to allocate a buffer for reading from a File, but this buffer must be aligned to the size of the cache line (64 bytes). I am looking for a function somewhat like this for Vec:
pub fn with_capacity_and_alignment(capacity: usize, alignment: u8) -> Vec<T>
which would give me the 64 byte alignment that I need. This obviously doesn't exist, but there might be some equivalences (i.e. "hacks") that I don't know about.
So, when I use this function (which will give me the desired alignment), I could write this code safely:
#[repr(C)]
struct Header {
    magic: u32,
    some_data1: u32,
    some_data2: u64,
}
let cache_line_size = 64; // bytes
let buffer: Vec<u8> = Vec::<u8>::with_capacity_and_alignment(some_size, cache_line_size);
match file.read_to_end(&mut buffer) {
    Ok(_) => {
        let header: Header = {
            // and since the buffer is aligned to 64 bytes, I won't get any SEGFAULT
            unsafe { transmute(buffer[0..(size_of::<Header>())]) }
        };
    }
}
and not get any panics because of alignment issues (like launching an instruction).
You can enforce the alignment of a type to a certain size using #[repr(align(...))]. We also use repr(C) to ensure that this type has the same memory layout as an array of bytes.
You can then create a vector of the aligned type and transform it to a vector of appropriate type:
use std::mem;
#[repr(C, align(64))]
struct AlignToSixtyFour([u8; 64]);
unsafe fn aligned_vec(n_bytes: usize) -> Vec<u8> {
    // Lazy math to ensure we always have enough.
    let n_units = (n_bytes / mem::size_of::<AlignToSixtyFour>()) + 1;
    let mut aligned: Vec<AlignToSixtyFour> = Vec::with_capacity(n_units);
    let ptr = aligned.as_mut_ptr();
    let len_units = aligned.len();
    let cap_units = aligned.capacity();
    mem::forget(aligned);
    Vec::from_raw_parts(
        ptr as *mut u8,
        len_units * mem::size_of::<AlignToSixtyFour>(),
        cap_units * mem::size_of::<AlignToSixtyFour>(),
    )
}
There are no guarantees that the Vec<u8> will remain aligned if you reallocate the data. This means that you cannot reallocate so you will need to know how big to allocate up front.
The function is unsafe for the same reason. When the type is dropped, the memory must be back to its original allocation, but this function cannot control that.
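One way to sidestep both caveats, as a sketch only: keep the Vec<AlignToSixtyFour> as the owner so it is dropped with its original layout, and only hand out byte slices that borrow from it. AlignedBuf, zeroed and as_mut_bytes are hypothetical names, and the Clone/Copy derives are added here just so vec! can fill the buffer.

```rust
use std::slice;

#[repr(C, align(64))]
#[derive(Clone, Copy)]
struct AlignToSixtyFour([u8; 64]);

/// Hypothetical owner of a 64-byte-aligned, zero-initialized byte buffer.
pub struct AlignedBuf {
    units: Vec<AlignToSixtyFour>,
    len: usize,
}

impl AlignedBuf {
    /// Creates a buffer of `len` zeroed bytes, rounded up to whole 64-byte units.
    pub fn zeroed(len: usize) -> Self {
        let n_units = (len + 63) / 64;
        AlignedBuf {
            units: vec![AlignToSixtyFour([0u8; 64]); n_units],
            len,
        }
    }

    /// Borrows the buffer as bytes; the slice starts on a 64-byte boundary.
    pub fn as_mut_bytes(&mut self) -> &mut [u8] {
        // SAFETY: `units` holds at least `len` initialized bytes, `u8` has
        // alignment 1, and the returned slice borrows `self` mutably.
        unsafe { slice::from_raw_parts_mut(self.units.as_mut_ptr().cast::<u8>(), self.len) }
    }
}
```

The buffer cannot grow through the slice, so it stays aligned, and something like file.read_exact(buf.as_mut_bytes()) would fill it in place.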
Thanks to BurntSushi5 for corrections and additions.
See also:
How can I align a struct to a specified byte boundary?
Align struct to cache lines in Rust
How do I convert a Vec<T> to a Vec<U> without copying the vector?
Because of the limitations and unsafety above, another potential idea would be to allocate a big-enough buffer (maybe with some wiggle room), and then use align_to to get a properly aligned chunk. You could use the same AlignToSixtyFour type as above, and then convert the &[AlignToSixtyFour] into a &[u8] with similar logic.
This technique could be used to give out (optionally mutable) slices that are aligned. Since they are slices, you don't have to worry about the user reallocating or dropping them. This would allow you to wrap it up in a nicer type.
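A minimal sketch of that idea, assuming the caller over-allocates by at least 63 bytes of wiggle room: aligned_window is a hypothetical name. Note that the align_to documentation does not strictly guarantee the middle slice is as long as possible, so the length of the result should be checked rather than assumed.

```rust
use std::mem::size_of;
use std::slice;

#[repr(C, align(64))]
pub struct AlignToSixtyFour([u8; 64]);

/// Hypothetical helper: returns the 64-byte-aligned part of `buf` as bytes.
/// The result may be shorter than `buf` (possibly empty).
pub fn aligned_window(buf: &mut [u8]) -> &mut [u8] {
    // SAFETY: `AlignToSixtyFour` is a plain byte array, so any bit pattern is valid.
    let (_prefix, aligned, _suffix) = unsafe { buf.align_to_mut::<AlignToSixtyFour>() };
    let len = aligned.len() * size_of::<AlignToSixtyFour>();
    let ptr = aligned.as_mut_ptr().cast::<u8>();
    // SAFETY: `ptr` points at `len` initialized bytes that are mutably
    // borrowed from `buf` for the lifetime of the returned slice.
    unsafe { slice::from_raw_parts_mut(ptr, len) }
}
```

Since the caller only ever sees a slice, there is nothing to reallocate or drop incorrectly; the backing Vec<u8> keeps its original layout.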
All that being said, I think that relying on alignment here is inappropriate for your actual goal of reading a struct from a file. Simply read the bytes (u32, u32, u64) and build the struct:
use byteorder::{LittleEndian, ReadBytesExt}; // 1.3.4
use std::{fs::File, io};
#[derive(Debug)]
struct Header {
    magic: u32,
    some_data1: u32,
    some_data2: u64,
}
impl Header {
    fn from_reader(mut reader: impl io::Read) -> Result<Self, Box<dyn std::error::Error>> {
        let magic = reader.read_u32::<LittleEndian>()?;
        let some_data1 = reader.read_u32::<LittleEndian>()?;
        let some_data2 = reader.read_u64::<LittleEndian>()?;
        Ok(Self {
            magic,
            some_data1,
            some_data2,
        })
    }
}
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut f = File::open("/etc/hosts")?;
    let header = Header::from_reader(&mut f)?;
    println!("{:?}", header);
    Ok(())
}
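If pulling in byteorder is not an option, roughly the same thing can be done with the standard library alone. This is a sketch that keeps the little-endian assumption from the code above; read_u32_le and read_u64_le are helper names made up for the example.

```rust
use std::io::{self, Read};

#[derive(Debug, PartialEq)]
pub struct Header {
    pub magic: u32,
    pub some_data1: u32,
    pub some_data2: u64,
}

/// Hypothetical helper: reads four bytes and decodes them as little-endian.
fn read_u32_le(reader: &mut impl Read) -> io::Result<u32> {
    let mut bytes = [0u8; 4];
    reader.read_exact(&mut bytes)?;
    Ok(u32::from_le_bytes(bytes))
}

/// Hypothetical helper: reads eight bytes and decodes them as little-endian.
fn read_u64_le(reader: &mut impl Read) -> io::Result<u64> {
    let mut bytes = [0u8; 8];
    reader.read_exact(&mut bytes)?;
    Ok(u64::from_le_bytes(bytes))
}

impl Header {
    /// Reads the three fields in declaration order; no alignment is required.
    pub fn from_reader(mut reader: impl Read) -> io::Result<Self> {
        Ok(Header {
            magic: read_u32_le(&mut reader)?,
            some_data1: read_u32_le(&mut reader)?,
            some_data2: read_u64_le(&mut reader)?,
        })
    }
}
```

Because every field is copied out of a small local array, the alignment of the source buffer never matters, and a truncated input shows up as an io::Error instead of undefined behavior.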
See also:
How to read a struct from a file in Rust?
Is this the most natural way to read structs from a binary file?
Can I take a byte array and deserialize it into a struct?
Transmuting u8 buffer to struct in Rust

How do I use cbindgen to return and free a Box<Vec<_>>?

I have a struct returned to C code from Rust. I have no idea if it's a good way to do things, but it does work for rebuilding the struct and freeing memory without leaks.
#[repr(C)]
pub struct s {
    // ...
}
#[repr(C)]
#[allow(clippy::box_vec)]
pub struct s_arr {
    arr: *const s,
    n: i8,
    vec: Box<Vec<s>>,
}
/// Frees memory that was returned to C code
pub unsafe extern "C" fn free_s_arr(a: *mut s_arr) {
    drop(Box::from_raw(a));
}
/// Generates an array for the C code
pub unsafe extern "C" fn gen_s_arr() -> *mut s_arr {
    let many_s: Vec<s> = Vec::new();
    // ... logic here
    Box::into_raw(Box::new(s_arr {
        arr: many_s.as_ptr(),
        n: many_s.len() as i8,
        vec: Box::new(many_s),
    }))
}
The C header is currently written by hand, but I wanted to try out cbindgen. The manual C definition for s_arr is:
struct s_arr {
    struct s *arr;
    int8_t n;
    void *_;
};
cbindgen generates the following for s_arr:
typedef struct Box_Vec_s Box_Vec_s;
typedef struct s_arr {
    const s *arr;
    int8_t n;
    Box_Vec_s vec;
} s_arr;
This doesn't work since struct Box_Vec_s is not defined. Ideally I would just want to override the cbindgen type generated for vec to make it void * since it requires no code changes and thus no additional testing, but I am open to other suggestions.
I have looked through the cbindgen documentation, though not the examples, and couldn't find anything.
Your question is a bit unclear, but I think that if I understood you right, you're confusing two things and being led down a dark alley as a result.
In C, a dynamically-sized array, as you probably know, is identified by two things:
Its starting position, as a pointer
Its length
Rust follows the same convention: a Vec<_>, under the hood, has the same structure (well, almost; it has a capacity as well, but that's beside the point).
Passing the boxed vector on top of a pointer is not only overkill, but extremely unwise. FFI bindings may be smart, but they're not smart enough to deal with a boxed complex type most of the time.
To solve this, we're going to simplify your bindings. I've added a single element in struct S to show you how it works. I've also cleaned up your FFI boundary:
#[repr(C)]
pub struct S {
    foo: u8
}
#[repr(C)]
pub struct s_arr {
    arr: *mut S,
    n: usize,
    cap: usize
}
// Retrieve the vector back
pub unsafe extern "C" fn recombine_s_arr(ptr: *mut S, n: usize, cap: usize) -> Vec<S> {
    Vec::from_raw_parts(ptr, n, cap)
}
#[no_mangle]
pub unsafe extern "C" fn gen_s_arr() -> s_arr {
    let mut many_s: Vec<S> = Vec::new();
    let output = s_arr {
        arr: many_s.as_mut_ptr(),
        n: many_s.len(),
        cap: many_s.capacity()
    };
    std::mem::forget(many_s);
    output
}
With this, cbindgen returns the expected header definitions:
typedef struct {
    uint8_t foo;
} so58311426S;
typedef struct {
    so58311426S *arr;
    uintptr_t n;
    uintptr_t cap;
} so58311426s_arr;
so58311426s_arr gen_s_arr(void);
This allows us to call gen_s_arr() from either C or Rust and retrieve a struct that is usable across both parts of the FFI boundary (so58311426s_arr). This struct contains all we need to be able to modify our array of S (well, so58311426S according to cbindgen).
When passing through FFI, you need to make sure of a few simple things:
You cannot pass raw boxes or non-primitive types; you will almost universally need to convert down to a set of pointers or change your definitions to accommodate (as I have done here)
You most definitely do not pass raw vectors. At most, you pass a slice, as that is a primitive type (see the point above).
You make sure to std::mem::forget() whatever you do not want to deallocate, and make sure to remember to deallocate it or reform it somewhere else.
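To make the last point concrete, here is a sketch of what the matching "free" side might look like for the pointer/length/capacity struct. free_s_arr taking the struct by value is an assumption of this example, not something cbindgen requires, and the sample contents are made up. The export attribute is left as a comment because its spelling depends on the edition: #[no_mangle], or #[unsafe(no_mangle)] on edition 2024.

```rust
use std::mem::ManuallyDrop;

#[repr(C)]
pub struct S {
    pub foo: u8,
}

#[allow(non_camel_case_types)]
#[repr(C)]
pub struct s_arr {
    pub arr: *mut S,
    pub n: usize,
    pub cap: usize,
}

/// Hands a vector to C as pointer, length and capacity without freeing it.
// Export attribute goes here, e.g. #[no_mangle].
pub extern "C" fn gen_s_arr() -> s_arr {
    // ManuallyDrop keeps the buffer alive after this function returns.
    let mut many_s = ManuallyDrop::new(vec![S { foo: 1 }, S { foo: 2 }, S { foo: 3 }]);
    s_arr {
        arr: many_s.as_mut_ptr(),
        n: many_s.len(),
        cap: many_s.capacity(),
    }
}

/// Rebuilds the vector from its raw parts and drops it.
/// Must be called at most once per value returned by `gen_s_arr`.
// Export attribute goes here, e.g. #[no_mangle].
pub unsafe extern "C" fn free_s_arr(a: s_arr) {
    if a.arr.is_null() {
        return;
    }
    // SAFETY (caller's contract): the three fields are exactly those produced
    // by `gen_s_arr` and have not been freed already.
    drop(unsafe { Vec::from_raw_parts(a.arr, a.n, a.cap) });
}
```

The C side would then treat arr as a plain array of n elements and pass the untouched struct back to free_s_arr when it is done.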
I will edit this question in an hour; I have a plane to get on to. Let me know if any of this needs clarifications and I'll get to it once I'm in the right country :-)
