Memory management with wasm-bindgen


I am trying to understand wasm-bindgen memory management and infer correct usage to ensure no memory is leaked.
In the "Exporting a struct to JS" section of the wasm-bindgen documentation, it says:
"The free function is required to be invoked to deallocate resources on the Rust side of things."
However, in the "Exported struct types" section, there is no namedStruct.free(); after let namedStruct = return_named_struct(42);. How do the two fit together?
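The ownership model behind the quoted sentence can be illustrated in plain Rust. This is only a sketch under an assumption: that an exported struct is boxed on the Rust heap and handed out as a raw pointer, with free turning the pointer back into a Box so it is dropped. It does not use wasm-bindgen itself, and `DROPPED`, this `return_named_struct`, and this `free` are hypothetical stand-ins, not the generated bindings.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many `NamedStruct` values have been dropped (for illustration only).
pub static DROPPED: AtomicUsize = AtomicUsize::new(0);

pub struct NamedStruct {
    pub inner: u32,
}

impl Drop for NamedStruct {
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical stand-in for an exported constructor: the value is moved to
/// the heap and ownership is given away as a raw pointer, so Rust will not
/// drop it automatically.
pub fn return_named_struct(inner: u32) -> *mut NamedStruct {
    Box::into_raw(Box::new(NamedStruct { inner }))
}

/// Hypothetical stand-in for `free()`: rebuilds the Box and drops it.
///
/// # Safety
/// `ptr` must come from `return_named_struct` and must not have been freed.
pub unsafe fn free(ptr: *mut NamedStruct) {
    unsafe { drop(Box::from_raw(ptr)) }
}
```

Under this model, never calling free means the Box is never rebuilt, so the value is never dropped and its memory stays allocated.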

Related

With Serde JSON why does "{c:[{}]}" cause a heap allocation when deserializing into a RawValue struct?

I'm trying to understand how heap allocations work in Serde JSON.
Why does the following code make one heap allocation? I am expecting no allocations, since the value of c is a borrowed serde_json::value::RawValue using #[serde(borrow)].
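// Requires serde (with the "derive" feature) and serde_json (with the "raw_value" feature).
use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct MyStruct<'a> {
    #[serde(borrow)]
    c: &'a serde_json::value::RawValue,
}

fn main() {
    let msg = r#"{"c":[{}]}"#;
    // One unexpected allocation here.
    serde_json::from_str::<MyStruct>(msg).unwrap();
}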
Note using {"c":[2, 3]} for example instead of {"c":[{}]} will result in no allocations.
How can I make it so there are zero allocations when deserializing into MyStruct?
Rust playground link.

You can't avoid the allocations. The parser needs to allocate some memory on the heap because JSON objects and arrays can nest arbitrarily deeply, and it needs to keep track of what type of value it's currently parsing.
I modified your program to panic on the first heap allocation that happens during parsing (because I'm too lazy to debug and there's no debugger on the playground). The backtrace shows where the heap allocation is coming from. The key frame is this one:
14: serde_json::de::Deserializer<R>::ignore_value
at ./.cargo/registry/src/github.com-1ecc6299db9ec823/serde_json-1.0.69/src/de.rs:1049:21
You can be sure that your MyStruct won't point to heap allocated memory, because your struct contains a shared reference and not an owned value (and the input is a static string). In order for c to refer to heap allocated memory, serde_json would have to leak it (and it doesn't do that; it would be pretty bad if every parse could leak memory!).
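A related way to check whether a piece of code allocates is to count calls into the global allocator instead of panicking on the first one. The sketch below swaps counting in for the panic approach described above and uses only the standard library. `CountingAlloc`, `ALLOCATIONS`, and `count_allocations` are hypothetical names chosen for this illustration.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocation requests seen so far, across all threads.
pub static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

/// Forwards every request to the system allocator and counts allocations.
pub struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::SeqCst);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns its result together with the number of allocations
/// observed while it ran. Other threads allocating at the same time are
/// counted too, so treat the number as an upper bound.
pub fn count_allocations<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = ALLOCATIONS.load(Ordering::SeqCst);
    let out = f();
    let after = ALLOCATIONS.load(Ordering::SeqCst);
    (out, after - before)
}
```

Wrapping the serde_json::from_str call from the question in count_allocations should report the allocation, assuming serde_json is available as a dependency.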

The allocation is internal to serde_json and will be freed before returning. There is an internal scratch area which can require allocations (source).
The README of serde_json also mentions the fact that it relies on alloc support, so allocations are not unexpected:
As long as there is a memory allocator, it is possible to use serde_json without the rest of the Rust standard library.
If you need to work without allocations you can instead try serde-json-core which is also mentioned in that README.
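One plausible reading of why [2, 3] needs no allocation while [{}] needs one is that a parser can keep the innermost open container in a local variable and only push outer containers onto a growable buffer. The sketch below illustrates that general technique. It is an assumption for illustration, not serde_json's actual code. `skip_brackets` is a hypothetical helper, and it ignores brackets inside string literals for brevity.

```rust
/// Checks that brackets and braces in `input` are balanced.
/// Returns `Some(true)` if the delimiter stack had to allocate on the heap,
/// `Some(false)` if it did not, and `None` if the input is unbalanced.
pub fn skip_brackets(input: &str) -> Option<bool> {
    // Innermost open container, kept in a local variable (no heap needed).
    let mut enclosing: Option<u8> = None;
    // Outer containers; `Vec::new()` does not allocate until the first push.
    let mut outer: Vec<u8> = Vec::new();

    for &b in input.as_bytes() {
        match b {
            b'[' | b'{' => {
                // Only a nested container forces a push, and so an allocation.
                if let Some(prev) = enclosing.replace(b) {
                    outer.push(prev);
                }
            }
            b']' | b'}' => {
                let open = enclosing?;
                let expected = if open == b'[' { b']' } else { b'}' };
                if b != expected {
                    return None;
                }
                enclosing = outer.pop();
            }
            _ => {}
        }
    }

    if enclosing.is_none() {
        Some(outer.capacity() > 0)
    } else {
        None
    }
}
```

With this sketch, a flat value such as [2, 3] never touches the Vec, while a nested value such as [{}] pushes once. The buffer is freed when the function returns, so nothing leaks.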

Allocate large struct from mmap using MALLOC_MMAP_THRESHOLD_

I have a big struct (~200 MB) that I deserialize from a large JSON file from Java using serde_json, and this deserialization occurs again whenever new data is available. The struct has Vecs, a HashMap of strings, structs of strings, etc.
While looking at the man page for mallopt(3), I found that the environment variable MALLOC_MMAP_THRESHOLD_ can be set to control how large an allocation request has to be for malloc to serve it using mmap. I want to allocate my struct from mmap because the heap is causing memory fragmentation during reloads. I want the old deallocated memory (the memory that is replaced by a newly deserialized struct) to be returned to the system immediately, and not kept around by one of the malloc arenas.
Is there a way to achieve this? Should I be using some other data format?

Find total size of struct at runtime

Is there any way to calculate the total stack and heap size of a struct at runtime?
As far as I can tell, std::mem::{size_of, size_of_val} only work for stack-allocated values, but a struct might contain heap-allocated buffers, too (e.g. Vec).

Servo used the heapsize crate to measure the size of heap allocations while the program runs.
You can call its heap_size_of function to measure the heap size allocated by jemalloc.
Be aware that you can get different results with different allocators.
According to the crate's GitHub page: "This crate is not maintained and is no longer used by Servo. At the time of writing, Servo uses internal malloc_size_of instead."
You can either use the heapsize crate or look at how malloc_size_of is implemented.
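Without any crate, a rough estimate can be built by hand: take the inline size from size_of_val and add the capacity of every owned buffer. The sketch below assumes a hypothetical `HeapSize` trait and `Record` struct invented for this illustration. It ignores allocator bookkeeping overhead, so the result is a lower bound rather than an exact figure.

```rust
use std::mem::{size_of, size_of_val};

/// Hypothetical trait: bytes owned on the heap by a value (excluding the
/// inline size of the value itself).
pub trait HeapSize {
    fn heap_bytes(&self) -> usize;
}

impl HeapSize for u32 {
    fn heap_bytes(&self) -> usize {
        0
    }
}

impl HeapSize for String {
    fn heap_bytes(&self) -> usize {
        self.capacity()
    }
}

impl<T: HeapSize> HeapSize for Vec<T> {
    fn heap_bytes(&self) -> usize {
        // The buffer itself, plus whatever each element owns on the heap.
        self.capacity() * size_of::<T>()
            + self.iter().map(HeapSize::heap_bytes).sum::<usize>()
    }
}

/// Hypothetical example struct with both inline and heap-allocated parts.
pub struct Record {
    pub id: u32,
    pub name: String,
    pub tags: Vec<String>,
}

impl HeapSize for Record {
    fn heap_bytes(&self) -> usize {
        self.id.heap_bytes() + self.name.heap_bytes() + self.tags.heap_bytes()
    }
}

/// Inline size plus estimated heap size.
pub fn total_bytes<T: HeapSize>(value: &T) -> usize {
    size_of_val(value) + value.heap_bytes()
}
```

Every type that owns heap memory needs its own implementation, which is the same bookkeeping that crates such as heapsize automate.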

What goes on the stack and what goes on the heap in Rust?

I am really confused about Rust's system of memory allocation.
In Java you use new to allocate memory on the heap. In C you use malloc(), everything else goes on the stack.
I thought, in Rust, Box<T> allocates memory on the heap but after reading "Defining Our Own Smart Pointer" section in chapter 15.2 in The Rust Programming Language it seems like MyBox<T> doesn't have any special annotation to make the value of T live on the heap.
What exactly goes on the stack and what goes on the heap?
Is the implementation of MyBox<T> essentially the same as Box<T>?
If the implementations are identical, what makes T stored on the heap rather than the stack?
If the implementations aren't identical what makes a Box<T> allocate memory on the heap?

This is hard to say. Rust usually avoids allocating anything on the heap. The compiler never performs an implicit heap allocation, but many library functions can do it for you. At the very least, anything that is dynamically sized (e.g. Vec<T>) needs something on the heap under the hood; for the rest, the documentation should hint at it.
Note that even in C, many functions can do heap allocation without an explicit call to malloc. Eg. I've had to debug a memory leak recently where a developer called getaddrinfo without a corresponding freeaddrinfo, ignoring that this function allocates memory on the heap. This class of bugs should be really rare in Rust however, thanks to RAII.
No, MyBox<T> is not implemented like Box<T> at all! The book is simplifying things here to avoid details that are not important for that section.
Box is a compiler built-in. Under the hood, the memory is allocated by an allocator defined in liballoc. You can think of this allocator as providing malloc-like functionality. When this answer was written, the default allocator used jemalloc on most targets; since Rust 1.32 the default is the system allocator. It is also possible to use a custom allocator; for example, the alloc_system crate used the system's malloc/realloc/free functions to build its allocator.
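The difference between the book's MyBox<T> and the real Box<T> shows up in their sizes. The sketch below reproduces a MyBox along the lines of the book's definition; the `sizes` helper is a hypothetical addition for this illustration. MyBox<T> stores the T inline, so it is as large as T, while Box<T> is only a pointer to a heap allocation.

```rust
use std::mem::size_of;
use std::ops::Deref;

/// A tuple struct wrapping `T` directly: the value lives wherever the
/// `MyBox` itself lives (typically the stack), not on the heap.
pub struct MyBox<T>(T);

impl<T> MyBox<T> {
    pub fn new(x: T) -> MyBox<T> {
        MyBox(x)
    }
}

impl<T> Deref for MyBox<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

/// Returns (size of MyBox<[u8; 1024]>, size of Box<[u8; 1024]>).
pub fn sizes() -> (usize, usize) {
    (
        size_of::<MyBox<[u8; 1024]>>(), // the whole array is stored inline
        size_of::<Box<[u8; 1024]>>(),   // just one pointer to the heap
    )
}
```

Calling sizes() gives 1024 for MyBox and the size of one pointer for Box, which is why MyBox needs no allocator at all.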

To set the record straight: malloc allocates on the heap, and so does new. Stack allocation doesn't require malloc. Memory obtained from malloc can only be released with free, and memory obtained from new can only be released with delete; otherwise you get a memory leak. In those languages, the developer manages what is allocated on the stack and what is allocated on the heap.
As for the question: a value created with Box::new goes on the heap. The memory is reclaimed by Box's Drop implementation (you don't see this call) when the box goes out of scope, unless you transfer ownership.
To complement this: types that have a known size at compile time can be stored entirely on the stack. Rust manages what goes on the stack and what goes on the heap; its ownership and borrowing rules should clarify this.

In Node.js, when is data stored on the heap?

In C you explicitly ask for and manage memory on the heap, so interaction with the heap is well defined/apparent. How do you reason about this in Node.js?
Sub-questions:
where/how are functions stored?
are there certain objects/primitives that always get stored on the heap? (e.g. buffers)
does data migrate from the stack over to the heap? when?
References to good resources on this subject would also be appreciated, thanks.

You don't need to care about stack vs. heap, nor about freeing memory. This happens automatically, since Node.js provides a precise tracing garbage collector. Some data is stored on the GC heap; some data is on the stack. You generally can't tell which, because it depends on optimisations performed by the JIT compiler at runtime. Profiling tools might provide application-specific insight.
As for resources other than memory (e.g. files and sockets), use finally:
var file = open(…);
try {
…
} finally {
close(file);
}
