Idiomatic Rust method for handling references to a buffer

I would like to construct objects that contain immutable references to a mutable buffer object. The following code does not work, but it illustrates my use case. Is there an idiomatic Rust method for handling this?
#[derive(Debug)]
struct Parser<'a> {
buffer: &'a String
}
fn main() {
let mut source = String::from("Peter");
let buffer = &source;
let parser = Parser { buffer };
// How can I legally change source?
source.push_str(" Pan");
println!("{:?}", parser);
}

The golden rule of the Rust borrow checker is: only one writer OR multiple readers can access a resource at a time. This ensures that algorithms are safe to run in multiple threads.
You breach this rule here:
#[derive(Debug)]
struct Parser<'a> {
buffer: &'a String
}
fn main() {
// `source` is declared mutable here
let mut source = String::from("Peter");
// the shared (immutable) borrow begins here
let buffer = &source;
let parser = Parser { buffer };
// error: this needs a mutable borrow while `parser` still holds the shared one
source.push_str(" Pan");
// the shared borrow is last used here
println!("{:?}", parser);
}
If you are sure that your program doesn't actively access resources at the same time mutably and immutably, you can move the check from compile time to run time by wrapping your resource in a RefCell:
use std::cell::RefCell;
use std::rc::Rc;
#[derive(Debug)]
struct Parser {
buffer: Rc<RefCell<String>>
}
fn main() {
let source = Rc::new(RefCell::new(String::from("Peter")));
let parser = Parser { buffer: source.clone() };
source.borrow_mut().push_str(" Pan");
println!("{:?}", parser);
}
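To make the "run time" part concrete, here is a minimal sketch of what RefCell does when the rule is broken. The helper names try_append and demo are made up for this illustration, and try_borrow_mut is used instead of borrow_mut so that the conflict shows up as an Err rather than a panic:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Tries to append to the shared buffer and reports whether the borrow succeeded.
fn try_append(buffer: &Rc<RefCell<String>>, text: &str) -> bool {
    match buffer.try_borrow_mut() {
        Ok(mut s) => {
            s.push_str(text);
            true
        }
        // Someone else currently holds a borrow, so the write is refused.
        Err(_) => false,
    }
}

/// Returns (write while a reader is alive, write after it is dropped, final contents).
fn demo() -> (bool, bool, String) {
    let source = Rc::new(RefCell::new(String::from("Peter")));
    let reader = source.borrow(); // a shared borrow is active from here
    let while_reading = try_append(&source, " Pan");
    drop(reader); // the shared borrow ends here
    let after_reading = try_append(&source, " Pan");
    let contents = source.borrow().clone();
    (while_reading, after_reading, contents)
}

fn main() {
    println!("{:?}", demo());
}
```

With plain borrow_mut the first attempt would panic instead of returning false, which is the price of moving the check to run time.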
If you plan on passing your resource around threads, you can use an RwLock to block the thread until the resource is available:
use std::sync::{RwLock, Arc};
#[derive(Debug)]
struct Parser {
buffer: Arc<RwLock<String>>
}
fn main() {
let source = Arc::new(RwLock::new(String::from("Peter")));
let parser = Parser { buffer: source.clone() };
source.write().unwrap().push_str(" Pan");
println!("{:?}", parser);
}
On another note, you should prefer &str over &String: a &String coerces to &str automatically, so a &str field or parameter accepts both.
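As a rough illustration of the &str advice, here is the Parser from the question with a &str field. Parser::new and first_word are hypothetical helpers added only for this sketch:

```rust
#[derive(Debug)]
struct Parser<'a> {
    buffer: &'a str,
}

impl<'a> Parser<'a> {
    fn new(buffer: &'a str) -> Self {
        Parser { buffer }
    }

    /// Returns the first whitespace-separated word, or "" for an empty buffer.
    fn first_word(&self) -> &'a str {
        self.buffer.split_whitespace().next().unwrap_or("")
    }
}

fn main() {
    let owned = String::from("Peter Pan");
    let from_string = Parser::new(&owned); // &String coerces to &str
    let from_literal = Parser::new("Wendy Darling"); // a literal works too
    println!("{:?} {:?}", from_string, from_literal);
    println!("{} {}", from_string.first_word(), from_literal.first_word());
}
```

The same struct now works for string literals, slices of a larger buffer, and borrowed Strings alike.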

It's hard to tell what exactly you want to achieve by mutating the source; I would assume you don't want it to happen while the parser is doing its work? You can always try (depending on your specific use case) to separate the immutable from the mutable with an extra scope:
fn main() {
let mut source = String::from("Peter");
{
let buffer = &source;
let parser = Parser { buffer };
println!("{:?}", parser);
}
source.push_str(" Pan");
}
If you don't want to use RefCell or unsafe (or simply keep a mutable reference to source in Parser and use that), I'm afraid it doesn't get better than plain refactoring.
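On current compilers the extra braces are often not even needed, because a borrow ends at its last use rather than at the end of the enclosing block (non-lexical lifetimes). A small sketch of the same reordering without the inner scope, with parse_then_extend as an invented name:

```rust
#[derive(Debug)]
struct Parser<'a> {
    buffer: &'a String,
}

/// Uses the parser first, then mutates the source once the parser is no longer needed.
fn parse_then_extend() -> (String, String) {
    let mut source = String::from("Peter");
    let parser = Parser { buffer: &source };
    // Last use of `parser`: the shared borrow of `source` ends after this line.
    let seen = format!("{:?}", parser);
    // No borrow is alive any more, so the mutation is accepted.
    source.push_str(" Pan");
    (seen, source)
}

fn main() {
    let (seen, source) = parse_then_extend();
    println!("{} / {}", seen, source);
}
```

This only works because Parser is not used again after the push_str; touching it afterwards brings the original error back.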

To elaborate on how this can be done unsafely: what you've described can be achieved by using a raw const pointer to sidestep the borrowing rules. That is of course inherently unsafe, as the very concept of what you've described is pretty unsafe. There are ways to make it safer, should you choose this path, but I would probably default to Arc<RwLock<T>> or Arc<Mutex<T>> if safety is important.
use std::fmt::{self, Display};
#[derive(Debug)]
struct Parser {
buffer: *const String
}
impl Display for Parser {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
let buffer = unsafe { &*self.buffer };
write!(f, "{}", buffer)
}
}
fn main() {
let mut source = String::from("Peter");
let buffer = &source as *const String;
let parser = Parser { buffer };
source.push_str(" Pan");
println!("{}", parser);
}

Related

Sharing a reference in multiple threads inside a function

I want to build a function that takes a HashMap reference as an argument. This HashMap should be shared between threads for read only access. The code example is very simple:
I insert some value into the HashMap, pass it to the function, and want another thread to read that value. I get an error that the borrowed value does not live long enough at the line let exit_code = test(&m);. Why is this not working?
use std::thread;
use std::collections::HashMap;
use std::sync::{Arc, RwLock };
fn main(){
let mut m: HashMap<u32, f64> = HashMap::new();
m.insert(0, 0.1);
let exit_code = test(&m);
std::process::exit(exit_code);
}
fn test(m: &'static HashMap<u32, f64>) -> i32{
let map_lock = Arc::new(RwLock::new(m));
let read_thread = thread::spawn(move || {
if let Ok(r_guard) = map_lock.read(){
println!("{:?}", r_guard.get(&0).unwrap());
}
});
read_thread.join().unwrap();
return 0;
}
If I don't put 'static in the function signature for the HashMap argument, Arc::new(RwLock::new(m)); doesn't work. How can I solve this problem?
A reference is not safe to hand to thread::spawn unless it is 'static, meaning the referent lives for the entire program. Otherwise the compiler cannot prove that the shared value outlives the thread.
You should wrap it outside of the function, and take ownership of an Arc:
use std::thread;
use std::collections::HashMap;
use std::sync::{Arc, RwLock };
fn main(){
let mut map = HashMap::new();
map.insert(0, 0.1);
let m = Arc::new(RwLock::new(map));
let exit_code = test(m);
std::process::exit(exit_code);
}
fn test(map_lock: Arc<RwLock<HashMap<u32, f64>>>) -> i32 {
let read_thread = thread::spawn(move || {
if let Ok(r_guard) = map_lock.read(){
println!("{:?}", r_guard.get(&0).unwrap());
}
});
read_thread.join().unwrap();
return 0;
}
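If the thread only needs to live for the duration of the function, scoped threads are another option on newer toolchains: std::thread::scope (stable since Rust 1.63) lets a spawned thread borrow non-'static data, because the scope joins every thread before it returns. A minimal sketch, where read_in_thread is a name made up for this example:

```rust
use std::collections::HashMap;
use std::thread;

/// Reads key 0 from the map on another thread without requiring 'static or Arc.
fn read_in_thread(m: &HashMap<u32, f64>) -> Option<f64> {
    thread::scope(|s| {
        // The closure only borrows `m`; the scope guarantees the thread
        // has finished before `read_in_thread` returns.
        let handle = s.spawn(|| m.get(&0).copied());
        handle.join().unwrap()
    })
}

fn main() {
    let mut map = HashMap::new();
    map.insert(0, 0.1);
    println!("{:?}", read_in_thread(&map));
}
```

For read-only access like this no RwLock is needed either, since shared references to a HashMap can be used from several threads at once.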

How to know my pointer will not be deallocated?

I am working on a Linux-PAM module and I implemented this function to get the user.
pub fn get_user(pamh: PamHandleT) -> PamResult<&'static CStr> {
let mut raw_user: *const c_char = ptr::null();
let r = unsafe { pam_get_user(pamh, &mut raw_user, ptr::null()) };
if raw_user.is_null() {
Err(r)
} else {
let user = unsafe {CStr::from_ptr(raw_user)};
Ok(user)
}
}
pam_get_user is a C function from libpam that returns a *const c_char via its second argument. The PAM documentation states that I must not free that pointer, to allow interoperability with other modules.
By using the 'static lifetime for the return value, I believe this value will not be deallocated, is that correct? Maybe I could copy the value to use it in a more Rust-idiomatic way, how could I do that?
CStr is responsible for handling the value and in contrast to CString it does not allocate or deallocate memory, just like str and String. You just pass a pointer to it and have to ensure its requirements. Make sure to read std::ffi::CStr carefully and understand what you are doing.
Your code looks fine so far, so you should be ready to go.
You probably need to make an owned value. This allocates, though.
pub fn get_user(pamh: PamHandleT) -> PamResult<CString> {
let mut raw_user: *const c_char = ptr::null();
let r = unsafe { pam_get_user(pamh, &mut raw_user, ptr::null()) };
if raw_user.is_null() {
Err(r)
} else {
let user = unsafe {CStr::from_ptr(raw_user)};
Ok(user.to_owned())
}
}
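If what you are after is a plain Rust String rather than a CString, one possible follow-up is to convert the borrowed CStr. The sketch below uses hard-coded byte strings in place of the pointer returned by pam_get_user, and user_to_string / user_to_string_strict are hypothetical helper names:

```rust
use std::ffi::CStr;

/// Copies the C string into a String, replacing invalid UTF-8 with U+FFFD.
fn user_to_string(user: &CStr) -> String {
    user.to_string_lossy().into_owned()
}

/// Copies the C string into a String, or returns None if it is not valid UTF-8.
fn user_to_string_strict(user: &CStr) -> Option<String> {
    user.to_str().ok().map(str::to_owned)
}

fn main() {
    // Stand-in for the data behind the pointer that PAM hands back.
    let raw = CStr::from_bytes_with_nul(b"alice\0").expect("valid C string");
    println!("{}", user_to_string(raw));
    println!("{:?}", user_to_string_strict(raw));
}
```

Which of the two fits depends on whether a user name that is not valid UTF-8 should be an error or merely be displayed with replacement characters.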
If you want to avoid allocation, you should create some context object. The pam_end documentation says:
"The pam_end function terminates the PAM transaction and is the last function an application should call in the PAM context. Upon return the handle pamh is no longer valid and all memory associated with it will be invalid."
struct TransactionContext{
pamh: PamHandleT
}
impl Drop for TransactionContext{
fn drop(&mut self){
unsafe {pam_end(self.pamh);} // the handle lives in self; a bare pamh is not in scope here
}
}
pub fn get_user(pamh: &TransactionContext) -> PamResult<&CStr> {
let mut raw_user: *const c_char = ptr::null();
let r = unsafe { pam_get_user(pamh.pamh, &mut raw_user, ptr::null()) };
if raw_user.is_null() {
Err(r)
} else {
let user = unsafe {CStr::from_ptr(raw_user)};
Ok(user)
}
}
This gives the resulting CStr the same lifetime as the TransactionContext, so the borrow checker ensures that you don't use the CStr after the TransactionContext is dropped.

How to idiomatically share data between closures with wasm-bindgen?

In my browser application, two closures access data stored in a Rc<RefCell<T>>. One closure mutably borrows the data, while the other immutably borrows it. The two closures are invoked independently of one another, and this will occasionally result in a BorrowError or BorrowMutError.
Here is my attempt at an MWE, though it uses a future to artificially inflate the likelihood of the error occurring:
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::task::{Context, Poll, Waker};
use wasm_bindgen::prelude::*;
use wasm_bindgen::JsValue;
#[wasm_bindgen]
extern "C" {
#[wasm_bindgen(js_namespace = console)]
pub fn log(s: &str);
#[wasm_bindgen(js_name = setTimeout)]
fn set_timeout(closure: &Closure<dyn FnMut()>, millis: u32) -> i32;
#[wasm_bindgen(js_name = setInterval)]
fn set_interval(closure: &Closure<dyn FnMut()>, millis: u32) -> i32;
}
pub struct Counter(u32);
#[wasm_bindgen(start)]
pub async fn main() -> Result<(), JsValue> {
console_error_panic_hook::set_once();
let counter = Rc::new(RefCell::new(Counter(0)));
let counter_clone = counter.clone();
let log_closure = Closure::wrap(Box::new(move || {
let c = counter_clone.borrow();
log(&c.0.to_string());
}) as Box<dyn FnMut()>);
set_interval(&log_closure, 1000);
log_closure.forget();
let counter_clone = counter.clone();
let increment_closure = Closure::wrap(Box::new(move || {
let counter_clone = counter_clone.clone();
wasm_bindgen_futures::spawn_local(async move {
let mut c = counter_clone.borrow_mut();
// In reality this future would be replaced by some other
// time-consuming operation manipulating the borrowed data
SleepFuture::new(5000).await;
c.0 += 1;
});
}) as Box<dyn FnMut()>);
set_timeout(&increment_closure, 3000);
increment_closure.forget();
Ok(())
}
struct SleepSharedState {
waker: Option<Waker>,
completed: bool,
closure: Option<Closure<dyn FnMut()>>,
}
struct SleepFuture {
shared_state: Rc<RefCell<SleepSharedState>>,
}
impl Future for SleepFuture {
type Output = ();
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
let mut shared_state = self.shared_state.borrow_mut();
if shared_state.completed {
Poll::Ready(())
} else {
shared_state.waker = Some(cx.waker().clone());
Poll::Pending
}
}
}
impl SleepFuture {
fn new(duration: u32) -> Self {
let shared_state = Rc::new(RefCell::new(SleepSharedState {
waker: None,
completed: false,
closure: None,
}));
let state_clone = shared_state.clone();
let closure = Closure::wrap(Box::new(move || {
let mut state = state_clone.borrow_mut();
state.completed = true;
if let Some(waker) = state.waker.take() {
waker.wake();
}
}) as Box<dyn FnMut()>);
set_timeout(&closure, duration);
shared_state.borrow_mut().closure = Some(closure);
SleepFuture { shared_state }
}
}
panicked at 'already mutably borrowed: BorrowError'
The error makes sense, but how should I go about resolving it?
My current solution is to have the closures use try_borrow or try_borrow_mut, and if unsuccessful, use setTimeout for an arbitrary amount of time before attempting to borrow again.
Think about this problem independently of Rust's borrow semantics. You have a long-running operation that's updating some shared state.
How would you do it if you were using threads? You would put the shared state behind a lock. RefCell is like a lock except that you can't block on unlocking it — but you can emulate blocking by using some kind of message-passing to wake up the reader.
How would you do it if you were using pure JavaScript? You don't automatically have anything like RefCell, so either:
The state can be safely read while the operation is still ongoing (in a concurrency-not-parallelism sense): in this case, emulate that by not holding a single RefMut (result of borrow_mut()) alive across an await boundary.
The state is not safe to be read: you'd either write something lock-like as described above, or perhaps arrange so that it's only written once when the operation is done, and until then, the long-running operation has its own private state not shared with the rest of the application (so there can be no BorrowError conflicts).
Think about what your application actually needs and pick a suitable solution. Implementing any of these solutions will most likely involve having additional interior-mutable objects used for communication.
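As a rough sketch of the first option (never keeping a RefMut alive across the slow part), here is a std-only stand-in that needs no wasm or async runtime. The long_running_step callback plays the role of the awaited future, and it is also where "the other closure" gets a chance to run; increment_with_short_borrow and demo are names invented for this illustration:

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

pub struct Counter(u32);

/// Runs the slow step without holding any borrow, then updates the counter
/// inside a borrow that lasts for a single statement.
fn increment_with_short_borrow(counter: &Rc<RefCell<Counter>>, long_running_step: impl Fn()) {
    // No borrow is held here; in the async version this is where `.await` would sit.
    long_running_step();
    // The RefMut is created and dropped within this one statement.
    counter.borrow_mut().0 += 1;
}

/// Returns (whether a reader could borrow during the slow step, final counter value).
fn demo() -> (bool, u32) {
    let counter = Rc::new(RefCell::new(Counter(0)));
    let reader = Rc::clone(&counter);
    let read_ok = Cell::new(false);
    increment_with_short_borrow(&counter, || {
        // Simulates the logging closure firing in the middle of the operation.
        read_ok.set(reader.try_borrow().is_ok());
    });
    let value = counter.borrow().0;
    (read_ok.get(), value)
}

fn main() {
    println!("{:?}", demo());
}
```

In the MWE this would correspond to moving the borrow_mut() call below the SleepFuture await, assuming the intermediate state is safe for the logger to observe.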

Using str and String interchangeably

Suppose I'm trying to do a fancy zero-copy parser in Rust using &str, but sometimes I need to modify the text (e.g. to implement variable substitution). I really want to do something like this:
fn main() {
let mut v: Vec<&str> = "Hello there $world!".split_whitespace().collect();
for t in v.iter_mut() {
if (t.contains("$world")) {
*t = &t.replace("$world", "Earth");
}
}
println!("{:?}", &v);
}
But of course the String returned by t.replace() doesn't live long enough. Is there a nice way around this? Perhaps there is a type which means "ideally a &str but if necessary a String"? Or maybe there is a way to use lifetime annotations to tell the compiler that the returned String should be kept alive until the end of main() (or have the same lifetime as v)?
Rust has exactly what you want in form of a Cow (Clone On Write) type.
use std::borrow::Cow;
fn main() {
let mut v: Vec<_> = "Hello there $world!".split_whitespace()
.map(|s| Cow::Borrowed(s))
.collect();
for t in v.iter_mut() {
if t.contains("$world") {
*t.to_mut() = t.replace("$world", "Earth");
}
}
println!("{:?}", &v);
}
As @sellibitze correctly notes, to_mut() creates a new String, which causes a heap allocation to store the previously borrowed value. If you are sure you only have borrowed strings, then you can use
*t = Cow::Owned(t.replace("$world", "Earth"));
In case the Vec contains Cow::Owned elements, this would still throw away the allocation. You can prevent that using the following very fragile and unsafe code inside your for loop. It does direct byte-based manipulation of UTF-8 strings and relies on the fact that, once the $ sign is removed, the replacement "Earth" has exactly as many bytes as "world".
let mut last_pos = 0; // so we don't start at the beginning every time
while let Some(pos) = t[last_pos..].find("$world") {
let p = pos + last_pos; // find returns an offset relative to last_pos
last_pos = p + 5; // resume after the 5 bytes of "Earth"
unsafe {
let s = t.to_mut().as_mut_vec(); // operating on Vec is easier
s.remove(p); // remove $ sign
for (c, sc) in "Earth".bytes().zip(&mut s[p..]) {
*sc = c;
}
}
}
Note that this is tailored exactly to the "$world" -> "Earth" mapping. Any other mappings require careful consideration inside the unsafe code.
Use std::borrow::Cow, specifically as Cow<'a, str>, where 'a is the lifetime of the string being parsed.
use std::borrow::Cow;
fn main() {
let mut v: Vec<Cow<'static, str>> = vec![];
v.push("oh hai".into());
v.push(format!("there, {}.", "Mark").into());
println!("{:?}", v);
}
Produces:
["oh hai", "there, Mark."]
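Putting the two answers together, the "ideally a &str but if necessary a String" idea can be sketched as a function that only allocates when a substitution actually happens. substitute and expand are names invented for this example:

```rust
use std::borrow::Cow;

/// Returns the token unchanged (borrowed) unless it contains the variable,
/// in which case a new owned String is produced.
fn substitute(token: &str) -> Cow<'_, str> {
    if token.contains("$world") {
        Cow::Owned(token.replace("$world", "Earth"))
    } else {
        Cow::Borrowed(token)
    }
}

/// Splits the input on whitespace and applies the substitution to each token.
fn expand(input: &str) -> Vec<Cow<'_, str>> {
    input.split_whitespace().map(substitute).collect()
}

fn main() {
    let v = expand("Hello there $world!");
    println!("{:?}", v);
}
```

Tokens that need no change stay zero-copy slices of the input, and only the substituted ones own their data.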

Rust MemWriter return pointer to Buffer

I'd like to have a function use a MemWriter to write some bytes and then return a pointer to the buffer. I'm struggling to understand how to use lifetimes in this case. How would I make the below code work and what should I read to fill my knowledge gap here?
struct Request<T: Encodable> {
id: i16,
e: T
}
impl <T: Encodable> Request<T> {
fn serialize<'s>(&'s self) -> io::IoResult<&'s Vec<u8>> {
let mut writer = io::MemWriter::new();
try!(writer.write_be_i16(0 as i16));
let buf = writer.unwrap();
let size = buf.len();
let result: io::IoResult<&Vec<u8>> = Ok(&buf);
result
}
}
You can't return a reference to a buffer that is stored nowhere.
You need to store your buffer internally; otherwise you would return a reference to freed memory, which is dangerous and thus forbidden by the lifetime checker.
For example, like this:
struct Request<T: Encodable> {
id: i16,
e: T,
buf: Vec<u8>
}
impl <T: Encodable> Request<T> {
fn serialize<'s>(&'s mut self) -> io::IoResult<&'s Vec<u8>> {
let mut writer = io::MemWriter::new();
try!(writer.write_be_i16(0 as i16));
self.buf = writer.unwrap();
let size = self.buf.len();
let result: io::IoResult<&Vec<u8>> = Ok(&self.buf);
result
}
}
Or, as Vladimir Matveev pointed out in the comments, you can simply return the Vec. Vec is already a container that safely manages memory on the heap, so returning it directly should be fine in most situations, and this way you avoid any lifetime issues.
impl <T: Encodable> Request<T> {
fn serialize(&mut self) -> io::IoResult<Vec<u8>> {
let mut writer = io::MemWriter::new();
try!(writer.write_be_i16(0 as i16));
let buf = writer.unwrap();
let size = buf.len();
Ok(buf)
}
}
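For readers on current Rust: MemWriter, IoResult and try! come from a pre-1.0 standard library. A rough equivalent of the "just return the Vec" version with today's std might look like the following. It is only a sketch: the Encodable payload is left out, the struct is reduced to the id field so the example stays self-contained, and it writes self.id where the original wrote a constant 0:

```rust
use std::io::{self, Write};

struct Request {
    id: i16,
}

impl Request {
    /// Serializes the request into a freshly allocated buffer and returns it by value.
    fn serialize(&self) -> io::Result<Vec<u8>> {
        // Vec<u8> implements io::Write, so it can stand in for MemWriter.
        let mut buf = Vec::new();
        // Big-endian i16, matching the old write_be_i16 call.
        buf.write_all(&self.id.to_be_bytes())?;
        Ok(buf)
    }
}

fn main() -> io::Result<()> {
    let bytes = Request { id: 1 }.serialize()?;
    println!("{:?}", bytes);
    Ok(())
}
```

Returning the Vec by value moves it out of the function, so no lifetime annotations are involved at all.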

Resources