LabVIEW crashed when Rust deallocates CString - rust

I am playing with LabVIEW 2019 64bit and Rust 1.64, but LabVIEW crashed on a very simple code.
What I am trying to do is passing a Hello from Rust string to LabVIEW and then deallocate it back in Rust.
use libc::c_char;
use std::ffi::CString;
#[no_mangle]
pub extern "cdecl" fn hello() -> *const c_char {
CString::new("Hello from Rust")
.expect("failed to create c string")
.into_raw()
}
#[no_mangle]
pub extern "cdecl" fn bye(s: *mut c_char) {
unsafe {drop(CString::from_raw(s));}
}
LabView can display correct string but freezes and then crashed when calling bye(s).
However, if I manually disable bye(s) function call, everything is all right, but it violates Rust suggestion:
alloc::ffi::c_str::CString
pub fn into_raw(self) -> *mut c_char
Consumes the CString and transfers ownership of the string to a C caller.
The pointer which this function returns must be returned to Rust and reconstituted using CString::from_raw to be properly deallocated. Specifically, one should not use the standard C free() function to deallocate this string.
Failure to call CString::from_raw will lead to a memory leak.
What is the correct way to perform this simple task?
Thanks to #srm , LabVIEW is working fine now. Here is the working block diagram:

The problem you have is that LabVIEW is receiving the C string and converting it to an L string (very different memory signature). The pink wire in LabVIEW is a pointer to a char[N] where N is the length of the string + 4 bytes... the first 4 bytes of the array is the length of the string. It's similar to a Pascal string but with 4 byte length instead of 1 byte length. The data structure that LV is consuming isn't the same one that you're passing out to bye(). You never get your hands on that raw pointer... with your current Call Library setup. So let's change your setup.
Background: Remember that LabVIEW has, essentially, the same rules for memory management that Rust has -- namely that LabVIEW knows to allocate and deallocate data on a fixed schedule, and it expects to be in control of the memory management. LabVIEW does NOT have a garbage collector. It analyzes the dataflow and makes decisions about where to deterministically release data. But it also doesn't assume that data goes out of scope -- there is no "scope" for LabVIEW VIs. Ideally for LabVIEW, that pink wire will remain allocated for the next call of the VI on the presumption that the next call will also need the same string buffer.
Solution: If you want Rust to allocate the string, you need to tell LabVIEW that what is being returned is just an address. So tell LabVIEW the return type is a pointer-size integer, not a string. Also return the length of the string.
On the diagram, call your hello() function to return the integer and the length of the string. Then have another Call Library function to invoke the built-in function MoveBlock.
https://www.ni.com/docs/en-US/bundle/labview/page/lvexcode/moveblock.html
Set the library name to "LabVIEW" -- no file extension. Then make the first parameter be your pointer-sized integer. The second parameter is a LabVIEW array of bytes that you call Initialize Array and init to N characters where N is the length of the string -- configure the Call Library node to do arrays by pointer. The third parameter is the length of the string.
MoveBlock will copy the bytes out of the C string allocated by Rust into the L string allocated by LabVIEW. You can then pass the pointer-sized integer into your bye() function.

Related

Is there a difference between a borrowed integer and a copy?

I know that a String mainly consists of a pointer that contains the address to its allocated place in the heap memory. Rust prohibits any copies of Strings to avoid double free errors, so it introduced borrowing, where the code basically only copies the pointer value without copying the value in the heap.
However, integer types are stored in the stack and hence do not have a pointer. Yet it is still possible to create a reference to an integer:
let i: i64 = 42;
let j = &i;
Since an integer contains no reference to the heap, isn't a borrowed integer simply a regular copy of it? E.g. is there any difference between j = i and j = &i?
Consider the case if the reference were mutable:
let mut i: u64 = 42;
let j = &mut i;
*j = 5;
println!("{}", i);
5
It should be obvious from this demonstration that j is not simply a copy. It does reference and therefore modify the original i.
Integer types are stored in the stack and hence do not have a pointer.
Not sure where you got that idea from. If it exists in memory, then it has an address within that memory, and therefore you can have a pointer (or reference) that points to it. The properties of a u64 do not change depending on where it is.
The comparison to strings may be tripping you up:
let s = String::from("hello world");
let s_ref: &String = &s;
let str_ref: &str = s.as_str();
If you have a String variable s, and take a reference to it, s_ref, it does not point directly to the heap, it points to the variable s on the stack. There is a slice-type str that represents a region of utf8-encoded bytes, which a String holds on the heap. You can get a reference to that region of memory directly on the heap by getting it via .as_str()/.as_ref() or by converting the &String into a &str via deref coercion.
But in the case of u64 vs &u64, there isn't much of a practical difference between the two except the latter incurs an extra level of indirection in the generated code and you may have to worry about lifetime constraints. Because of that, its usually better to use copies of integer types if given the choice. You'd still see references to integers though if using them through some generic interface.
Yes, there is a difference between the two. The fact that an integer lives in the heap, or in the stack, doesn't change the fact that it's somewhere in the memory, so it has an address. A pointer being just that address, even integers can have pointers to. And, indeed, if you try using a pointer to an integer as an integer, you'll have a problem, because of type mismatch.
The difference between a String and a number type such as i64 is that i64: Copy, which means that you can turn a &i64 into a i64 just by "copying" the values (as opposed to calling a dedicated function that knows how to appropriately clone stuff, such as String::clone, which comes from Clone::clone). This means that Rust will allow implicit copying of an integer, so, from this perspective, a pointer to an integer is as permissive as an integer in itself.

How to properly convert a Rust string into a C string?

I'm trying to use a C library that requires me to give it strings (const char*) as function arguments, among other things.
I tried writing 2 functions for this (Rust -> C, C -> Rust), so I'm already aware CString and CStr exist, but couldn't get them working for the 1st conversion.
I tried using solutions from Stack Overflow and examples from the docs for both of them but the result always ends up garbage (mostly random characters and on one occasion the same result as in this post).
// My understanding is that C strings have a \0 terminator, which I need to append to input
// without having the variable created deallocated/destroyed/you name it by Rust right after
// I don't have any other explanation for the data becoming garbage if I clone it.
// Also, this conversion works if i manually append \0 to the end of the string at the constructor
pub unsafe fn convert_str(input: &str) -> *const c_char {
let c_str = input.as_ptr() as *const c_char;
return c_str
}
// Works, at least for now
pub unsafe fn cstr_to_str(c_buf: *const i8) -> &'static str {
let cstr = CStr::from_ptr(c_buf);
return cstr.to_str().expect("success");
}
The resulting implementation acts like this:
let pointer = convert_str("Hello");
let result = cstr_to_str(pointer);
println!("Hello becomes {}", result);
// Output:
// Hello becomes HelloHello becomescannot access a Thread Local Storage value during or after destruction/rustc/fe5b1...
// LOOKsrc/main.rsHello this is windowFailed to create GLFW window.
How do I fix this? Is there a better way to do this I'm not seeing?
Rust strings don't have a \0 terminator, so to append one, convert_str must necessarily allocate memory (since it can't modify input - even in unsafe code, it can't know whether there's space for one more byte in the memory allocated for input)
If you're gonna wrangle C strings, you're gonna have to do C style manual management, i.e. together with returning a char*, convert_str also has to return the implicit obligation to free the string to the caller. Said differently, convert_str can't deallocate the buffer it must allocate. (If it did, using the pointer it returns would be a use after free, which indeed results in garbage.)
So your code might look like this:
Allocate a new CString in convert_str with CString::new(input).unwrap() and make sure its internal buffer doesn't get dropped at the end of the function with .into_raw()
Deallocate the return value of convert_str when you're done using it with drop(CString::from_raw(pointer)).
Playground

How to get Rust Vec buffer?

I want to dynamically allocate an array at runtime, so I use Vec to implement it. And I want to use a raw pointer to point the array address, like this:
fn alloc(&mut page: Page) {
page.data = vec![0; page.page_size].as_mut_ptr();//it's a Vec<u8>
}
I want to know if the pointer directly points to the vec buffer, and the length is exactly the page.page_size?
The effect I want to have is just like the following C code:
void alloc(Page* page) {
page->data = (u8*)malloc(page->page_size);
}
As far as I know the only guarantee you get from vec in this case, when it is created by new() or vec![] macro is it will have reserved at least page_size bytes of data. To reserve exact amount of bytes create it with Vec::with_capacity().
When working with raw pointers it is the responsibility of the programmer to ensure the data lives long enough. In this example the vec will only live within alloc so when your function return the address it will free the buffer at the same time.

How does exactly rust handle return values

I've got question about my code:
pub fn get_signals(path: &String) -> Vec<Vec<f64>> {
let mut rdr = csv::ReaderBuilder::new().delimiter(b';').from_path(&path).unwrap();
let mut signals: Vec<Vec<f64>> = Vec::new();
for record in rdr.records(){
let mut r = record.unwrap();
for (i, value) in r.iter().enumerate(){
match signals.get(i){
Some(_) => {},
None => signals.push(Vec::new())
}
signals[i].push(value.parse::<f64>().unwrap());
}
}
signals
}
How exactly does Rust handle return? When I, for example write let signals = get_signal(&"data.csv".to_string()); does Rust assume I want a new instance of Vec(copies all the data) or just pass a pointer to previously allocated(via Vec::new()) memory? What is the most efficient way to do this? Also, what happens with rdr? I assume, given Rusts memory safety, it's destroyed.
How exactly does Rust handle return?
The only guarantee Rust, the language, makes is that values are never cloned without an explicit .clone() in the code. Therefore, from a semantic point of view, the value is moved which will not require allocating memory.
does Rust assume I want a new instance of Vec(copies all the data) or just pass a pointer to previously allocated (via Vec::new()) memory?
This is implementation specific, and part of the ABI (Application Binary Interface). The Rust ABI is not formalized, and not stable, so there is no standard describing it and no guarantee about this holding up.
Furthermore, this will depend on whether the function call is inlined or not. If the function call is inlined, there is of course no return any longer yet the same behavior should be observed.
For small values, they should be returned via a register (or a couple of registers).
For larger values:
the caller should reserve memory on the stack (properly sized and aligned) and pass a pointer to this area to the callee,
the callee will then construct the return value at the place pointed to, so that by the time it returns the value exists there for the caller to use.
Note: by the size here is the size on the stack, as returned by std::mem::size_of; so size_of::<Vec<_>>() == 24 on 64-bits architecture.
What is the most efficient way to do this?
Returning is as efficient as it gets for a single call.
If however you find yourself in a situation where, say, you want to read a file line by line, then it makes sense to reuse the buffer from one call to the other which can be accomplished either by:
taking a &mut references to the buffer (String or Vec<u8> say),
or taking a buffer by value and returning it.
The point being to avoid memory allocations.

Should I pass a mutable reference or transfer ownership of a variable in the context of FFI?

I have a program that utilizes a Windows API via a C FFI (via winapi-rs). One of the functions expects a pointer to a pointer to a string as an output parameter. The function will store its result into this string. I'm using a variable of type WideCString for this string.
Can I "just" pass in a mutable ref to a ref to a string into this function (inside an unsafe block) or should I rather use a functionality like .into_raw() and .from_raw() that also moves the ownership of the variable to the C function?
Both versions compile and work but I'm wondering whether I'm buying any disadvantages with the direct way.
Here are the relevant lines from my code utilizing .into_raw and .from_raw.
let mut widestr: WideCString = WideCString::from_str("test").unwrap(); //this is the string where the result should be stored
let mut security_descriptor_ptr: winnt::LPWSTR = widestr.into_raw();
let rtrn3 = unsafe {
advapi32::ConvertSecurityDescriptorToStringSecurityDescriptorW(sd_buffer.as_mut_ptr() as *mut std::os::raw::c_void,
1,
winnt::DACL_SECURITY_INFORMATION,
&mut security_descriptor_ptr,
ptr::null_mut())
};
if rtrn3 == 0 {
match IOError::last_os_error().raw_os_error() {
Some(1008) => println!("Need to fix this errror in get_acl_of_file."), // Do nothing. No idea, why this error occurs
Some(e) => panic!("Unknown OS error in get_acl_of_file {}", e),
None => panic!("That should not happen in get_acl_of_file!"),
}
}
let mut rtr: WideCString = unsafe{WideCString::from_raw(security_descriptor_ptr)};
The description of this parameter in MSDN says:
A pointer to a variable that receives a pointer to a null-terminated security descriptor string. For a description of the string format, see Security Descriptor String Format. To free the returned buffer, call the LocalFree function.
I am expecting the function to change the value of the variable. Doesn't that - per definition - mean that I'm moving ownership?
I am expecting the function to change the value of the variable. Doesn't that - per definition - mean that I'm moving ownership?
No. One key way to think about ownership is: who is responsible for destroying the value when you are done with it.
Competent C APIs (and Microsoft generally falls into this category) document expected ownership rules, although sometimes the words are oblique or assume some level of outside knowledge. This particular function says:
To free the returned buffer, call the LocalFree function.
That means that the ConvertSecurityDescriptorToStringSecurityDescriptorW is going to perform some kind of allocation and return that to the user. Checking out the function declaration, you can also see that they document that parameter as being an "out" parameter:
_Out_ LPTSTR *StringSecurityDescriptor,
Why is it done this way? Because the caller doesn't know how much memory to allocate to store the string 1!
Normally, you'd pass a reference to uninitialized memory into the function which must then initialize it for you.
This compiles, but you didn't provide enough context to actually call it, so who knows if it works:
extern crate advapi32;
extern crate winapi;
extern crate widestring;
use std::{mem, ptr, io};
use winapi::{winnt, PSECURITY_DESCRIPTOR};
use widestring::WideCString;
fn foo(sd_buffer: PSECURITY_DESCRIPTOR) -> WideCString {
let mut security_descriptor = unsafe { mem::uninitialized() };
let retval = unsafe {
advapi32::ConvertSecurityDescriptorToStringSecurityDescriptorW(
sd_buffer,
1,
winnt::DACL_SECURITY_INFORMATION,
&mut security_descriptor,
ptr::null_mut()
)
};
if retval == 0 {
match io::Error::last_os_error().raw_os_error() {
Some(1008) => println!("Need to fix this errror in get_acl_of_file."), // Do nothing. No idea, why this error occurs
Some(e) => panic!("Unknown OS error in get_acl_of_file {}", e),
None => panic!("That should not happen in get_acl_of_file!"),
}
}
unsafe { WideCString::from_raw(security_descriptor) }
}
fn main() {
let x = foo(ptr::null_mut());
println!("{:?}", x);
}
[dependencies]
winapi = { git = "https://github.com/nils-tekampe/winapi-rs/", rev = "1bb62e2c22d0f5833cfa9eec1db2c9cfc2a4a303" }
advapi32-sys = { git = "https://github.com/nils-tekampe/winapi-rs/", rev = "1bb62e2c22d0f5833cfa9eec1db2c9cfc2a4a303" }
widestring = "*"
Answering your questions directly:
Can I "just" pass in a mutable ref to a ref to a string into this function (inside an unsafe block) or should I rather use a functionality like .into_raw() and .from_raw() that also moves the ownership of the variable to the C function?
Neither. The function doesn't expect you to pass it a pointer to a string, it wants you to pass a pointer to a place where it can put a string.
I also just realized after your explanation that (as far as I understood it) in my example, the widestr variable never gets overwritten by the C function. It overwrites the reference to it but not the data itself.
It's very likely that the memory allocated by WideCString::from_str("test") is completely leaked, as nothing has a reference to that pointer after the function call.
Is this a general rule that a C (WinAPI) function will always allocate the buffer by itself (if not following the two step approach where it first returns the size)?
I don't believe there are any general rules between C APIs or even inside of a C API. Especially at a company as big as Microsoft with so much API surface. You need to read the documentation for each method. This is part of the constant drag that can make writing C feel like a slog.
it somehow feels odd for me to hand over uninitialized memory to such a function.
Yep, because there's not really a guarantee that the function initializes it. In fact, it would be wasteful to initialize it in case of failure, so it probably doesn't. It's another thing that Rust seems to have nicer solutions for.
Note that you shouldn't do function calls (e.g. println!) before calling things like last_os_error; those function calls might change the value of the last error!
1 Other Windows APIs actually require a multistep process - you call the function with NULL, it returns the number of bytes you need to allocate, then you call it again

Resources