Understanding memory allocation in `wasm-bindgen` - rust

Context
I'd like to improve my understanding of memory management in wasm-bindgen, and of the use of the free function in particular.
My understanding is that anything allocated must be freed; in particular, this should hold for values passed from JS into Rust.
Example
In the docs there is an example where a &str is passed from JS to Rust, and it is indeed freed at the end, as expected.
export function greet(arg0) {
    const [ptr0, len0] = passStringToWasm(arg0);
    try {
        const ret = wasm.greet(ptr0, len0);
        const ptr = wasm.__wbindgen_boxed_str_ptr(ret);
        const len = wasm.__wbindgen_boxed_str_len(ret);
        const realRet = getStringFromWasm(ptr, len);
        wasm.__wbindgen_boxed_str_free(ret);
        return realRet;
    } finally {
        wasm.__wbindgen_free(ptr0, len0);
    }
}
Counterexample?
However, in the following function, which is also generated by wasm-bindgen, a Uint32Array is passed to wasm, for which memory is allocated, yet seemingly this memory is not freed at the end of the function.
module.exports.generate_key = function(seed, sid, pid, counterparties, t) {
    _assertClass(seed, RngSeed);
    var ptr0 = seed.ptr;
    seed.ptr = 0;
    const ptr1 = passArray32ToWasm0(counterparties, wasm.__wbindgen_malloc);
    const len1 = WASM_VECTOR_LEN;
    const ret = wasm.generate_key(ptr0, sid, pid, ptr1, len1, t);
    return takeObject(ret);
};
This is generated from the following rust code.
#[wasm_bindgen]
pub async fn generate_key(
    seed: RngSeed,
    sid: u64,
    pid: u32,
    counterparties: &[u32],
    t: usize,
) -> Result<KeyShare, Error> {
    ...
}
Question
Why is the memory allocated in passArray32ToWasm0 seemingly not freed here?
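One plausible explanation (not confirmed by the snippet above, but consistent with how wasm-bindgen typically handles slice arguments) is that the generated Rust-side glue takes ownership of the buffer and frees it when it goes out of scope, so no JS-side free is needed. A minimal sketch of that ownership hand-off, with illustrative names that are not wasm-bindgen internals:

```rust
// Hypothetical sketch: how Rust-side glue can free a buffer that the
// JS glue allocated and "forgot" (names are illustrative only).
fn sum_via_reclaim() -> u32 {
    // Simulate passArray32ToWasm0: allocate, hand out (ptr, len), forget.
    let data: Vec<u32> = vec![1, 2, 3];
    let len = data.len();
    let ptr = data.as_ptr() as *mut u32;
    std::mem::forget(data); // the "JS side" now holds only a raw ptr + len

    // Simulated Rust-side shim: reclaim ownership of the allocation.
    let owned = unsafe { Vec::from_raw_parts(ptr, len, len) };
    owned.iter().sum()
    // `owned` is dropped here, freeing the buffer: no explicit
    // __wbindgen_free call from JS is required.
}

fn main() {
    assert_eq!(sum_via_reclaim(), 6);
}
```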

Related

Lifetimes of thread::scope()'s variables and spawned threads

I'm trying to do some parallel processing on a list of values:
fn process_list(list: Vec<f32>) -> Vec<f32> { // Block F
    let chunk_size = 100;
    let output_list = vec![0.0f32; list.len()];
    thread::scope(|s| { // Block S
        (0..list.len()).collect::<Vec<_>>().chunks(chunk_size).for_each(|chunk| { // Block T
            s.spawn(|| {
                chunk.into_iter().for_each(|&idx| {
                    let value = calc_value(list[idx]);
                    unsafe {
                        let out = (output_list.as_ptr() as *mut f32).offset(idx as isize);
                        *out = value;
                    }
                });
            });
        });
    });
    output_list
}
The API says that thread::scope() only returns once each thread spawned by the scope has finished. However, the compiler tells me that the temporary created from (0..list.len()) is dropped while the threads that use it might still be alive.
I'm curious about what's actually happening under the hood. My intuition tells me that each thread spawned and variable created within Block S would both have Block S's lifetime. But clearly the threads have a lifetime longer than Block S.
Why aren't these lifetimes the same?
Is the best practice here to create a variable in Block F that serves the purpose of the temporary like so:
fn process_list(list: Vec<f32>) -> Vec<f32> { // Block F
    let chunk_size = 100;
    let output_list = vec![0.0f32; list.len()];
    let range = (0..list.len()).collect::<Vec<_>>();
    thread::scope(|s| { // Block S
        range.chunks(chunk_size).for_each(|chunk| { // Block T
            s.spawn(|| {
                chunk.into_iter().for_each(|&idx| {
                    let value = calc_value(list[idx]);
                    unsafe {
                        let out = (output_list.as_ptr() as *mut f32).offset(idx as isize);
                        *out = value;
                    }
                });
            });
        });
    });
    output_list
}
You don't have to ask SO to find out what happens under the hood of std functions; their source code is readily available. But since you asked, I'll try to explain a little more in the comments:
pub fn scope<'env, F, T>(f: F) -> T
where
    F: for<'scope> FnOnce(&'scope Scope<'scope, 'env>) -> T,
{
    // `Scope` creation, not very relevant to the issue at hand
    let scope = ...;

    // Here your 'Block S' gets run and returns.
    let result = catch_unwind(AssertUnwindSafe(|| f(&scope)));
    // The above is just a fancy way of calling `f` while catching any panics.

    // Only here do we wait for the spawned threads to finish running:
    while scope.data.num_running_threads.load(Ordering::Acquire) != 0 {
        park();
    }

    // Further, not so relevant, cleanup code
    // ...
}
So, as you can see, your assumption is wrong: Block S runs to completion first, and only afterwards does scope wait for the spawned threads, so Block S does not stick around as long as the threads.
And yes, the solution is to capture the owner of the chunks before you call thread::scope.
There is also no reason to dive into unsafe for your example; you can use zip instead:
fn process_list(list: Vec<f32>) -> Vec<f32> { // Block F
    let chunk_size = 100;
    let mut output_list = vec![0.0f32; list.len()];
    let mut zipped = list
        .into_iter()
        .zip(output_list.iter_mut())
        .collect::<Vec<_>>();
    thread::scope(|s| { // Block S
        zipped.chunks_mut(chunk_size).for_each(|chunk| { // Block T
            s.spawn(|| {
                chunk.into_iter().for_each(|(v, out)| {
                    let value = calc_value(*v);
                    **out = value;
                });
            });
        });
    });
    output_list
}

Understanding wasm-bindgen returned objects memory management

I'm trying to return a typed object from Rust to TypeScript, and ideally I don't want to have to manage memory manually (performance is not the highest priority). While doing this, I'm trying to understand the generated JS.
Rust:
#[wasm_bindgen]
pub fn retjs() -> JsValue {
    // in my actual project I serialize a struct with `JsValue::from_serde`
    JsValue::from_str("")
}

#[wasm_bindgen]
pub fn retstruct() -> A {
    A {
        a: "".to_owned(),
    }
}

#[wasm_bindgen]
pub struct A {
    a: String,
}

#[wasm_bindgen]
impl A {
    #[wasm_bindgen]
    pub fn foo(&self) -> String {
        return self.a.clone();
    }
}
Generated JS:
export function retjs() {
    const ret = wasm.retjs();
    return takeObject(ret);
}

function takeObject(idx) {
    const ret = getObject(idx);
    dropObject(idx);
    return ret;
}

function dropObject(idx) {
    if (idx < 36) return;
    heap[idx] = heap_next;
    heap_next = idx;
}

export function retstruct() {
    const ret = wasm.retstruct();
    return A.__wrap(ret);
}

// Generated class A omitted for brevity. Relevant here: it has a `free()` method.
Is my understanding correct that with the JsValue the memory is freed completely automatically? And that I have to do this manually with the struct? Is there a way to get around that?
I basically just want type safety in TypeScript, so that when I update the struct in Rust, the TypeScript code is automatically updated.

Guaranteeing struct layout for wasm in vector of structs

When using a vector of #[wasm_bindgen] structs in JavaScript, is there a guarantee that the order of the struct's fields will be maintained, so that bytes in wasm memory can be correctly interpreted in JS? I would like to have a vector of structs in Rust and deterministically reconstruct the structs on the JS side, without having to serialize across the wasm boundary with serde. I have the following Rust code:
#[wasm_bindgen]
pub struct S {
    a: u8,
    b: u16,
}

#[wasm_bindgen]
pub struct Container {
    ss: Vec<S>,
}

#[wasm_bindgen]
impl Container {
    pub fn new() -> Self {
        let ss = (0..10_u8)
            .map(|i| S {
                a: i,
                b: (2 * i).into(),
            })
            .collect();
        Self { ss }
    }

    pub fn items_ptr(&self) -> *const S {
        self.ss.as_ptr()
    }

    pub fn item_size(&self) -> usize {
        std::mem::size_of::<S>()
    }

    pub fn buffer_len(&self) -> usize {
        self.ss.len() * self.item_size()
    }
}
Now, on the JS side, I have the following:
import { memory } from "rust_wasm/rust_wasm.wasm";
import * as wasm from "rust_wasm";

const container = wasm.Container.new();
const items = new Uint8Array(memory.buffer, container.items_ptr(), container.buffer_len());

function getItemBytes(n) {
    const itemSize = container.item_size();
    const start = n * itemSize;
    const end = start + itemSize;
    return items.slice(start, end);
}
With the above code, I can obtain the (four, due to alignment) bytes that comprise an S in JS. Now, my question: how do I interpret those bytes to reconstruct the S (the fields a and b) in JS? In theory, I'd like to be able to say something like the following:
const itemBytes = getItemBytes(3);
const a = itemBytes[0];
const b = (itemBytes[1] << 8) + itemBytes[2];
But of course this relies on the fact that the layout in memory of an instance of S matches its definition, that u16 is big endian, etc. Is it possible to get Rust to enforce these properties so that I can safely interpret bytes in JS?
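One way to get part of those guarantees (a sketch, not from the original question): #[repr(C)] fixes the field order and padding according to the C layout rules, and the resulting offsets can be asserted in Rust. Note also that WebAssembly linear memory is little-endian by specification, so on the JS side b's low byte comes first, not its high byte as the snippet above assumes.

```rust
use std::mem;

// With #[repr(C)], `a` sits at offset 0 and `b` (after one padding byte,
// since u16 must be 2-aligned) at offset 2; total size is 4 bytes.
#[repr(C)]
pub struct S {
    a: u8,
    b: u16,
}

// Returns (size, offset of a, offset of b) for checking the layout.
fn s_layout() -> (usize, usize, usize) {
    let s = S { a: 1, b: 0x0203 };
    let base = &s as *const S as usize;
    let off_a = &s.a as *const u8 as usize - base;
    let off_b = &s.b as *const u16 as usize - base;
    (mem::size_of::<S>(), off_a, off_b)
}

fn main() {
    assert_eq!(s_layout(), (4, 0, 2));
}
```

With that in place, the JS side can read b as `itemBytes[2] | (itemBytes[3] << 8)`.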

How to return a string (or similar) from Rust in WebAssembly?

I created a small Wasm file from this Rust code:
#[no_mangle]
pub fn hello() -> &'static str {
    "hello from rust"
}
It builds and the hello function can be called from JS:
<!DOCTYPE html>
<html>
  <body>
    <script>
      fetch('main.wasm')
        .then(response => response.arrayBuffer())
        .then(bytes => WebAssembly.instantiate(bytes, {}))
        .then(results => {
          alert(results.instance.exports.hello());
        });
    </script>
  </body>
</html>
My problem is that the alert displays "undefined". If I return an i32 it works and displays the i32. I also tried to return a String, but that does not work either (it still displays "undefined").
Is there a way to return a string from Rust in WebAssembly? What type should I use?
WebAssembly only supports a few numeric types, which is all that can be returned via an exported function.
When you compile to WebAssembly, your string will be held in the module's linear memory. In order to read this string from the hosting JavaScript, you need to return a reference to its location in memory, and the length of the string, i.e. two integers. This allows you to read the string from memory.
You use this same technique regardless of which language you compile to WebAssembly. How can I return a JavaScript string from a WebAssembly function provides a detailed background to the problem.
With Rust specifically, you need to make use of the Foreign Function Interface (FFI), using the CString type as follows:
use std::ffi::CString;
use std::os::raw::c_char;

static HELLO: &'static str = "hello from rust";

#[no_mangle]
pub fn get_hello() -> *mut c_char {
    let s = CString::new(HELLO).unwrap();
    s.into_raw()
}

#[no_mangle]
pub fn get_hello_len() -> usize {
    HELLO.len()
}
The above code exports two functions, get_hello which returns a reference to the string, and get_hello_len which returns its length.
With the above code compiled to a wasm module, the string can be accessed as follows:
const res = await fetch('chip8.wasm');
const buffer = await res.arrayBuffer();
const module = await WebAssembly.compile(buffer);
const instance = await WebAssembly.instantiate(module);

// obtain the module memory
const linearMemory = instance.exports.memory;

// create a buffer starting at the reference to the exported string
const offset = instance.exports.get_hello();
const stringBuffer = new Uint8Array(linearMemory.buffer, offset,
    instance.exports.get_hello_len());

// create a string from this buffer
let str = '';
for (let i = 0; i < stringBuffer.length; i++) {
    str += String.fromCharCode(stringBuffer[i]);
}
console.log(str);
The C equivalent can be seen in action in a WasmFiddle.
You cannot directly return a Rust String or an &str. Instead, allocate and return a raw byte pointer to the data, which then has to be decoded into a JS string on the JavaScript side.
You can take a look at the SHA1 example here.
The functions of interest are in
demos/bundle.js - copyCStr
demos/sha1/sha1-digest.rs - digest
For more examples: https://www.hellorust.com/demos/sha1/index.html
Most examples I saw copy the string twice: first on the WASM side, into a CString or by shrinking the Vec to its capacity, and then on the JS side while decoding the UTF-8.
Given that we often use WASM for the sake of the speed, I sought to implement a version that would reuse the Rust vector.
use std::collections::HashMap;

/// Byte vectors shared with JavaScript.
///
/// A map from the payload's memory location to the `Vec<u8>`.
///
/// In order to deallocate memory in Rust we need not just the memory location but also its size.
/// In the case of strings and vectors, the freed size is the capacity.
/// Keeping the vector around allows us not to change its capacity.
///
/// Not thread-safe (assuming that we're running WASM from a single JavaScript thread).
static mut SHARED_VECS: Option<HashMap<u32, Vec<u8>>> = None;

extern "C" {
    fn console_log(rs: *const u8);
    fn console_log_8859_1(rs: *const u8);
}

#[no_mangle]
pub fn init() {
    unsafe { SHARED_VECS = Some(HashMap::new()) }
}

#[no_mangle]
pub fn vec_len(payload: *const u8) -> u32 {
    unsafe {
        SHARED_VECS
            .as_ref()
            .unwrap()
            .get(&(payload as u32))
            .unwrap()
            .len() as u32
    }
}

pub fn vec2js<V: Into<Vec<u8>>>(v: V) -> *const u8 {
    let v = v.into();
    let payload = v.as_ptr();
    unsafe {
        SHARED_VECS.as_mut().unwrap().insert(payload as u32, v);
    }
    payload
}

#[no_mangle]
pub extern "C" fn free_vec(payload: *const u8) {
    unsafe {
        SHARED_VECS.as_mut().unwrap().remove(&(payload as u32));
    }
}

#[no_mangle]
pub fn start() {
    unsafe {
        console_log(vec2js(format!("Hello again!")));
        console_log_8859_1(vec2js(b"ASCII string." as &[u8]));
    }
}
And the JavaScript part:
(function (iif) {
    function rs2js (mod, rs, utfLabel = 'utf-8') {
        const view = new Uint8Array (mod.memory.buffer, rs, mod.vec_len (rs))
        const utf8dec = new TextDecoder (utfLabel)
        const utf8 = utf8dec.decode (view)
        mod.free_vec (rs)
        return utf8}
    function loadWasm (cache) {
        // https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WebAssembly/instantiateStreaming
        WebAssembly.instantiateStreaming (fetch ('main.wasm', {cache: cache ? "default" : "no-cache"}), {env: {
            console_log: function (rs) {if (window.console) console.log ('main]', rs2js (iif.main, rs))},
            console_log_8859_1: function (rs) {if (window.console) console.log ('main]', rs2js (iif.main, rs, 'iso-8859-1'))}
        }}) .then (results => {
            const exports = results.instance.exports
            exports.init()
            iif.main = exports
            iif.main.start()})}
    // Hot code reloading.
    if (window.location.hostname == '127.0.0.1' && window.location.port == '43080') {
        window.setInterval (
            function() {
                // Check if the WASM was updated.
                fetch ('main.wasm.lm', {cache: "no-cache"}) .then (r => r.text()) .then (lm => {
                    lm = lm.trim()
                    if (/^\d+$/.test (lm) && lm != iif.lm) {
                        iif.lm = lm
                        loadWasm (false)}})},
            200)
    } else loadWasm (true)
} (window.iif = window.iif || {}))
The trade-off here is that we're using a HashMap in the WASM module, which might increase the binary size unless a HashMap is already required.
An interesting alternative would be to use the tables to share the (payload, length, capacity) triplet with the JavaScript and get it back when it is time to free the string. But I don't know how to use the tables yet.
P.S. Sometimes we don't want to allocate the Vec in the first place.
In this case we can move the memory tracking to JavaScript:
extern "C" {
    fn new_js_string(utf8: *const u8, len: i32) -> i32;
    fn console_log(js: i32);
}

fn rs2js(rs: &str) -> i32 {
    assert!(rs.len() < i32::max_value() as usize);
    unsafe { new_js_string(rs.as_ptr(), rs.len() as i32) }
}

#[no_mangle]
pub fn start() {
    unsafe {
        console_log(rs2js("Hello again!"));
    }
}
(function (iif) {
    function loadWasm (cache) {
        WebAssembly.instantiateStreaming (fetch ('main.wasm', {cache: cache ? "default" : "no-cache"}), {env: {
            new_js_string: function (utf8, len) {
                const view = new Uint8Array (iif.main.memory.buffer, utf8, len)
                const utf8dec = new TextDecoder ('utf-8')
                const decoded = utf8dec.decode (view)
                let stringId = iif.lastStringId
                while (typeof iif.strings[stringId] !== 'undefined') stringId += 1
                if (stringId > 2147483647) { // Can't easily pass more than that through WASM.
                    stringId = -2147483648
                    while (typeof iif.strings[stringId] !== 'undefined') stringId += 1
                    if (stringId > 2147483647) throw new Error ('Out of string IDs!')}
                iif.strings[stringId] = decoded
                return iif.lastStringId = stringId},
            console_log: function (js) {
                if (window.console) console.log ('main]', iif.strings[js])
                delete iif.strings[js]}
        }}) .then (results => {
            iif.main = results.instance.exports
            iif.main.start()})}
    loadWasm (true)
} (window.iif = window.iif || {strings: {}, lastStringId: 1}))
Return String from Rust fn to ReactApp
TLDR:
Add use wasm_bindgen::prelude::*; to lib.rs.
Use JsValue as the return type of the fn.
Return JsValue::from_str("string") from the fn.
Create Rust Library for Function
mkdir ~/hello-from-rust-demo
cd ~/hello-from-rust-demo
cargo new --lib hello-wasm
cd hello-wasm
cargo add wasm-bindgen
code ~/hello-from-rust-demo/hello-wasm/src/lib.rs
use wasm_bindgen::prelude::*;
#[wasm_bindgen]
pub fn hello(name: &str) -> JsValue {
JsValue::from_str(&format!("Hello from rust, {}!", name))
}
cargo install wasm-pack
wasm-pack build --target web
Create React App to Demo Rust Function
cd ~/hello-from-rust-demo
yarn create react-app hello
cd hello
yarn add ../hello-wasm/pkg
code ~/hello-from-rust-demo/hello/src/App.js
App.js
import init, { hello } from 'hello-wasm';
import { useState, useEffect } from 'react';

function App() {
    // Use a distinct name for the state so it doesn't shadow the imported `hello`.
    const [greet, setGreet] = useState(null);
    useEffect(() => {
        init().then(() => {
            setGreet(() => hello);
        });
    }, []);
    // Don't call the function until the wasm module has been initialized.
    return greet ? greet("Human") : null;
}

export default App;
Start App
yarn start
Hello from rust, Human!

How can you allocate a raw mutable pointer in stable Rust?

I was trying to build a naive implementation of a custom String-like struct with small string optimization. Now that unions are allowed in stable Rust, I came up with the following code:
struct Large {
    capacity: usize,
    buffer: *mut u8,
}

struct Small([u8; 16]);

union Container {
    large: Large,
    small: Small,
}

struct MyString {
    len: usize,
    container: Container,
}
I can't seem to find a way to allocate that *mut u8. Is it possible in stable Rust? It looks like using alloc::heap would work, but it is only available in nightly.
As of Rust 1.28, std::alloc::alloc is stable.
Here is an example which shows in general how it can be used.
use std::{
    alloc::{self, Layout},
    cmp, mem, ptr, slice, str,
};

// This really should **not** be copied
#[derive(Copy, Clone)]
struct Large {
    capacity: usize,
    buffer: *mut u8,
}

// This really should **not** be copied
#[derive(Copy, Clone, Default)]
struct Small([u8; 16]);

union Container {
    large: Large,
    small: Small,
}

struct MyString {
    len: usize,
    container: Container,
}

impl MyString {
    fn new() -> Self {
        MyString {
            len: 0,
            container: Container {
                small: Small::default(),
            },
        }
    }

    fn as_buf(&self) -> &[u8] {
        unsafe {
            if self.len <= 16 {
                &self.container.small.0[..self.len]
            } else {
                slice::from_raw_parts(self.container.large.buffer, self.len)
            }
        }
    }

    pub fn as_str(&self) -> &str {
        unsafe { str::from_utf8_unchecked(self.as_buf()) }
    }

    // Not actually UTF-8 safe!
    fn push(&mut self, c: u8) {
        unsafe {
            use cmp::Ordering::*;

            match self.len.cmp(&16) {
                Less => {
                    self.container.small.0[self.len] = c;
                }
                Equal => {
                    let capacity = 17;
                    let layout = Layout::from_size_align(capacity, mem::align_of::<u8>())
                        .expect("Bad layout");
                    let buffer = alloc::alloc(layout);
                    {
                        let buf = self.as_buf();
                        ptr::copy_nonoverlapping(buf.as_ptr(), buffer, buf.len());
                    }
                    self.container.large = Large { capacity, buffer };
                    *self.container.large.buffer.offset(self.len as isize) = c;
                }
                Greater => {
                    let Large { capacity, buffer } = self.container.large;
                    // `realloc` must be given the layout of the *existing*
                    // allocation, together with the new size.
                    let old_layout = Layout::from_size_align(capacity, mem::align_of::<u8>())
                        .expect("Bad layout");
                    let capacity = capacity + 1;
                    let buffer = alloc::realloc(buffer, old_layout, capacity);
                    self.container.large = Large { capacity, buffer };
                    *self.container.large.buffer.offset(self.len as isize) = c;
                }
            }

            self.len += 1;
        }
    }
}

impl Drop for MyString {
    fn drop(&mut self) {
        unsafe {
            if self.len > 16 {
                let Large { capacity, buffer } = self.container.large;
                let layout =
                    Layout::from_size_align(capacity, mem::align_of::<u8>()).expect("Bad layout");
                alloc::dealloc(buffer, layout);
            }
        }
    }
}

fn main() {
    let mut s = MyString::new();
    for _ in 0..32 {
        s.push(b'a');
        println!("{}", s.as_str());
    }
}
I believe this code to be correct with respect to allocations, but not for anything else. Like all unsafe code, verify it yourself. It's also completely inefficient as it reallocates for every additional character.
If you'd like to allocate a collection of u8 instead of a single u8, you can create a Vec and then convert it into the constituent pieces, such as by calling as_mut_ptr:
use std::mem;

fn main() {
    let mut foo = vec![0; 1024]; // or Vec::<u8>::with_capacity(1024);
    let ptr = foo.as_mut_ptr();
    let cap = foo.capacity();
    let len = foo.len();
    mem::forget(foo); // Avoid calling the destructor!

    let foo_again = unsafe { Vec::from_raw_parts(ptr, len, cap) }; // Rebuild it to drop it
    // Do *NOT* use `ptr` / `cap` / `len` anymore
}
Reallocating is a bit of a pain, though; you'd have to convert back to a Vec and do the whole dance forwards and backwards.
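That dance can be sketched like this (assuming the (ptr, len, cap) triple originally came from a forgotten Vec):

```rust
use std::mem;

// Grow a raw (ptr, len, cap) allocation that originally came from a Vec:
// rebuild the Vec, reserve more space, then take it apart again.
unsafe fn grow(ptr: *mut u8, len: usize, cap: usize, additional: usize) -> (*mut u8, usize, usize) {
    let mut v = Vec::from_raw_parts(ptr, len, cap);
    v.reserve(additional);
    let triple = (v.as_mut_ptr(), v.len(), v.capacity());
    mem::forget(v); // hand ownership back out as raw parts
    triple
}

fn main() {
    let mut v: Vec<u8> = Vec::with_capacity(4);
    v.extend_from_slice(b"abcd");
    let (ptr, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());
    mem::forget(v);

    let (ptr, len, cap) = unsafe { grow(ptr, len, cap, 1024) };
    assert!(cap >= len + 1024);

    // Rebuild one final time so the memory is actually freed on drop.
    let rebuilt = unsafe { Vec::from_raw_parts(ptr, len, cap) };
    assert_eq!(rebuilt, b"abcd".to_vec());
}
```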
That being said, your Large struct seems to be missing a length, which would be distinct from capacity; I see now it's stored a level up in the hierarchy. You could just use a Vec instead of writing it out.
I wonder if having a full String wouldn't be a lot easier, even if it were a bit less efficient in that the length is double-counted:
use std::mem::ManuallyDrop;

union Container {
    // On stable Rust, non-`Copy` union fields must be wrapped in `ManuallyDrop`.
    large: ManuallyDrop<String>,
    small: Small,
}
See also:
What is the right way to allocate data to pass to an FFI call?
How do I use the Rust memory allocator for a C library that can be provided an allocator?
What about Box::into_raw()?
struct TypeMatches(*mut u8);
TypeMatches(Box::into_raw(Box::new(0u8)));
But it's difficult to tell from your code snippet if this is what you really need. You probably want a real allocator, and you could use libc::malloc with an as cast, as in this example.
There's a memalloc crate which provides a stable allocation API. It's implemented by allocating memory with Vec::with_capacity, then extracting the pointer:
let mut vec: Vec<u8> = Vec::with_capacity(cap);
let ptr = vec.as_mut_ptr();
mem::forget(vec);
To free the memory, use Vec::from_raw_parts.
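Freeing can be sketched the same way: rebuild the Vec from its raw parts, using the capacity captured at allocation time, and let it drop. A self-contained sketch:

```rust
use std::mem;

// Allocates, forgets, then frees; returns the freed capacity for checking.
fn alloc_then_free(n: usize) -> usize {
    let mut vec: Vec<u8> = Vec::with_capacity(n);
    let ptr = vec.as_mut_ptr();
    let cap = vec.capacity();
    mem::forget(vec);

    // ... the raw pointer could be handed across an FFI boundary here ...

    // To free: rebuild the Vec with the original capacity and drop it.
    unsafe { drop(Vec::from_raw_parts(ptr, 0, cap)) };
    cap
}

fn main() {
    assert!(alloc_then_free(1024) >= 1024);
}
```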
