I'm trying to design a struct to carry around a Postgres connection, transaction, and a bunch of prepared statements, and then execute the prepared statements repeatedly. But I'm running into lifetime problems. Here is what I've got:
extern crate postgres;
use postgres::{Connection, TlsMode};
use postgres::transaction::Transaction;
use postgres::stmt::Statement;
pub struct Db<'a> {
conn: Connection,
tx: Transaction<'a>,
insert_user: Statement<'a>,
}
fn make_db(url: &str) -> Db {
let conn = Connection::connect(url, TlsMode::None).unwrap();
let tx = conn.transaction().unwrap();
let insert_user = tx.prepare("INSERT INTO users VALUES ($1)").unwrap();
Db {
conn: conn,
tx: tx,
insert_user: insert_user,
}
}
pub fn main() {
let db = make_db("postgres://paul@localhost/t");
for u in &["foo", "bar"] {
db.insert_user.execute(&[&u]);
}
db.tx.commit().unwrap();
}
Here is the error I'm getting (on Rust 1.15.0 stable):
error: `conn` does not live long enough
--> src/main.rs:15:14
|
15 | let tx = conn.transaction().unwrap();
| ^^^^ does not live long enough
...
22 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the anonymous lifetime #1 defined on the body at 13:28...
--> src/main.rs:13:29
|
13 | fn make_db(url: &str) -> Db {
| ^
I've read the Rust book (I've lost count how many times), but I'm not sure how to make progress here. Any suggestions?
EDIT: Thinking about this some more I still don't understand why in principle I can't tell Rust, "conn lives as long as Db does". The issue is with moving conn, but what if I don't move it? I understand why in C you can't return a pointer to stack-allocated memory, e.g.:
#include <stdio.h>
int *build_array() {
int ar[] = {1,2,3};
return ar;
}
int main() {
int *ar = build_array();
printf("%d\n", ar[1]);
}
And I get how that is similar to in Rust returning a &str or returning a vec slice.
But in Rust you can do this:
#[derive(Debug)]
struct S {
ar: Vec<i32>,
}
fn build_array() -> S {
let v = vec![1, 2, 3];
S { ar: v }
}
fn main() {
let s = build_array();
println!("{:?}", s);
}
And my understanding is that Rust is smart enough so that returning S doesn't actually require a move; essentially it is going straight to the caller's stack frame.
So I don't understand why it can't also put Db (including conn) in the caller's stack frame. Then no moves would be required, and tx would never hold an invalid address. I feel like Rust should be able to figure that out. I tried adding a lifetime hint, like this:
pub struct Db<'a> {
conn: Connection<'a>,
tx: Transaction<'a>,
insert_user: Statement<'a>,
}
But that gives an "unexpected lifetime parameter" error. I can accept that Rust can't follow the logic, but I'm curious if there is a reason why in principle it couldn't.
It does seem that putting conn on the heap should solve my problems, but I can't get this to work either:
pub struct Db<'a> {
conn: Box<Connection>,
tx: Transaction<'a>,
insert_user: Statement<'a>,
}
Even with a let conn = Box::new(Connection::connect(...));, Rust still tells me "conn does not live long enough". Is there some way to make this work with Box, or is that a dead end?
EDIT 2: I tried doing this with macros also, to avoid any extra stack frames:
extern crate postgres;
use postgres::{Connection, TlsMode};
use postgres::transaction::Transaction;
use postgres::stmt::Statement;
pub struct Db<'a> {
conn: Connection,
tx: Transaction<'a>,
insert_user: Statement<'a>,
}
macro_rules! make_db {
( $x:expr ) => {
{
let conn = Connection::connect($x, TlsMode::None).unwrap();
let tx = conn.transaction().unwrap();
let insert_user = tx.prepare("INSERT INTO users VALUES ($1)").unwrap();
Db {
conn: conn,
tx: tx,
insert_user: insert_user,
}
}
}
}
pub fn main() {
let db = make_db!("postgres://paul@localhost/t");
for u in &["foo", "bar"] {
db.insert_user.execute(&[&u]);
}
db.tx.commit().unwrap();
}
But that still tells me that conn does not live long enough. It seems that moving it into the struct should really not require any real RAM changes, but Rust still won't let me do it.
Starting with this function:
fn make_db(url: &str) -> Db {
unimplemented!()
}
Due to lifetime elision, this is equivalent to:
fn make_db<'a>(url: &'a str) -> Db<'a> {
unimplemented!()
}
That is, every reference inside the Db struct must remain valid for as long as the string slice passed in. That only makes sense if the struct is holding on to the string slice.
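As a minimal sketch of a case where that elided signature does make sense, consider a hypothetical Holder struct (invented for illustration, not part of the postgres crate) that really does keep the slice it was given. The returned value is then legitimately tied to the argument's lifetime:

```rust
// Hypothetical struct, used only for illustration: it stores the slice itself.
struct Holder<'a> {
    text: &'a str,
}

// Elided form; the compiler reads it as
// fn make_holder<'a>(s: &'a str) -> Holder<'a>
fn make_holder(s: &str) -> Holder<'_> {
    Holder { text: s }
}

fn main() {
    let owned = String::from("hello");
    // `holder` borrows from `owned`, so it cannot outlive it.
    let holder = make_holder(&owned);
    println!("{}", holder.text);
}
```

Db in the question does not keep the URL, which is why the same signature is misleading there.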
To "solve" that, we can try to separate the lifetimes:
fn make_db<'a, 'b>(url: &'a str) -> Db<'b> {
unimplemented!()
}
Now this makes even less sense because now we are just making up a lifetime. Where is that 'b coming from? What happens if the caller of make_db decides that the concrete lifetime for the generic lifetime parameter 'b should be 'static? This is further explained in Why can't I store a value and a reference to that value in the same struct?, search for "something is really wrong with our creation function".
The other question also has a section titled "Sometimes, I'm not even taking a reference of the value", whose answer says:
the Child instance contains a reference to the Parent that created it,
If we check out the definition for Connection::transaction:
fn transaction<'a>(&'a self) -> Result<Transaction<'a>>
or the definition if you don't believe the docs:
pub struct Transaction<'conn> {
conn: &'conn Connection,
depth: u32,
savepoint_name: Option<String>,
commit: Cell<bool>,
finished: bool,
}
Yup, a Transaction keeps a reference to its parent Connection. Now that we see that Transaction has a reference to Connection we can return to the other question to see how to solve the problem: split apart the structs so that the nesting mirrors the lifetimes.
This was a very long-winded way of saying: no, you cannot create a single structure that contains a database and a transaction of that database due to the implementation of the postgres crate. Presumably the crate is implemented in this fashion for maximum performance.
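The "nesting mirrors the lifetimes" idea can be sketched without the postgres crate. Conn and Tx below are hypothetical stand-ins (assumptions for illustration only) that mimic how a Transaction borrows its Connection; the point is that the caller owns the connection and the bundle only borrows it:

```rust
use std::cell::RefCell;

// Hypothetical stand-in for postgres::Connection; it just records statements.
struct Conn {
    log: RefCell<Vec<String>>,
}

// Hypothetical stand-in for Transaction: it keeps a reference to its parent.
struct Tx<'conn> {
    conn: &'conn Conn,
}

impl Conn {
    fn new() -> Conn {
        Conn { log: RefCell::new(Vec::new()) }
    }

    fn transaction(&self) -> Tx<'_> {
        Tx { conn: self }
    }
}

impl<'conn> Tx<'conn> {
    fn execute(&self, stmt: &str) {
        self.conn.log.borrow_mut().push(stmt.to_string());
    }
}

// The bundle holds only things that borrow from a connection owned elsewhere.
struct Db<'a> {
    tx: Tx<'a>,
}

fn make_db(conn: &Conn) -> Db<'_> {
    Db { tx: conn.transaction() }
}

fn run_demo() -> Vec<String> {
    let conn = Conn::new(); // the owner lives in the caller's scope
    let db = make_db(&conn); // db borrows conn, so conn must outlive db
    db.tx.execute("INSERT foo");
    db.tx.execute("INSERT bar");
    let log = conn.log.borrow().clone();
    log
}

fn main() {
    println!("{:?}", run_demo());
}
```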
I don't see why [returning Db<'b>] makes less sense. Normally when a function returns a thing, the thing lives as long as it is assigned to something. Why can't -> Db work the same way?
The entire point of references is that you don't own the referred-to value. You return Db and the caller of make_db would own that, but what owns the thing that Db is referring to? Where did it come from? You cannot return a reference to something local as that would violate all of Rust's safety rules. If you want to transfer ownership, you just do that.
See also
Is there any way to return a reference to a variable created in a function?
Return local String as a slice (&str)
Using the other answer, I put together working code that lets me bundle up the transaction and all the prepared statements, and pass them around together:
extern crate postgres;
use postgres::{Connection, TlsMode};
use postgres::transaction::Transaction;
use postgres::stmt::Statement;
pub struct Db<'a> {
tx: Transaction<'a>,
insert_user: Statement<'a>,
}
fn make_db(conn: &Connection) -> Db {
let tx = conn.transaction().unwrap();
let insert_user = tx.prepare("INSERT INTO users VALUES ($1)").unwrap();
Db {
tx: tx,
insert_user: insert_user,
}
}
pub fn main() {
let conn = Connection::connect("postgres://paul@localhost/t", TlsMode::None).unwrap();
let db = make_db(&conn);
for u in &["foo", "bar"] {
db.insert_user.execute(&[&u]);
}
db.tx.commit().unwrap();
}
As I understand it, Rust wants to guarantee that conn lives as long as db, so by keeping conn outside of the "constructor", the lexical structure ensures that it won't get removed too early.
My struct still doesn't encapsulate conn, which seems too bad to me, but at least it lets me keep everything else together.
Related
In module db.rs, while reading a value from the database, I got the error "temporary value dropped while borrowed", with the hint "consider using a let binding to create a longer lived value":
use bytes::Bytes;
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
#[derive(Clone, Debug)]
pub struct Db {
// pub entries: Arc<bool>,
pub entries: Arc<Mutex<HashMap<String, Bytes>>>,
}
impl Db {
pub fn new() -> Db {
Db {
entries: Arc::new(Mutex::new(HashMap::new())),
}
}
/// Reads data from the database
pub fn read(&mut self, arr: &[String]) -> Result<Bytes, &'static str> {
let key = &arr[1];
let query_result = self.entries.lock().unwrap().get(key);// Error in this Line.
if let Some(value) = query_result {
return Ok(Bytes::from("hello"));
} else {
return Err("no such key found");
}
}
}
But when I modify the code to get the value on the next line, it doesn't give any error:
let query_result = self.entries.lock().unwrap();
let result = query_result.get(key);
Can anyone help me understand what's going on under the hood?
We can see why Rust thinks this is an error by checking how Mutex::lock works. If successful, it doesn't return a reference directly, it returns a MutexGuard struct that can deref into the type it wraps, a HashMap in your case.
The signature of Deref::deref, with Target = T and the elided lifetimes added, is:
fn deref<'a>(&'a self) -> &'a T
This means that the MutexGuard can only give us a reference to the HashMap inside for as long as it is itself alive (the lifetime 'a). But because you never store it anywhere, instead dereferencing it directly, Rust thinks that it gets dropped right after the call to get. But you keep the result of get around, which can only live for as long as the reference to the HashMap passed into it, which in turn only lives as long as the MutexGuard which gets dropped immediately.
If you store the MutexGuard, on the other hand, like
let guard = self.entries.lock().unwrap();
let query_result = guard.get(key);
it only gets dropped at the end of the scope, so any references it gave out are also valid until the end of the scope.
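Here is a self-contained sketch of both ways to stay within the guard's lifetime. It assumes String values instead of Bytes so that it needs no external crate, and the write and read_inline helpers are hypothetical additions for the demonstration:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

struct Db {
    entries: Arc<Mutex<HashMap<String, String>>>,
}

impl Db {
    fn new() -> Db {
        Db { entries: Arc::new(Mutex::new(HashMap::new())) }
    }

    // Hypothetical helper so the demo has something to read back.
    fn write(&self, key: &str, value: &str) {
        self.entries.lock().unwrap().insert(key.to_string(), value.to_string());
    }

    // Option 1: bind the guard, so it lives until the end of the function
    // and the reference returned by `get` stays valid while we use it.
    fn read(&self, key: &str) -> Result<String, &'static str> {
        let guard = self.entries.lock().unwrap();
        match guard.get(key) {
            Some(value) => Ok(value.clone()),
            None => Err("no such key found"),
        }
    }

    // Option 2: turn the borrowed value into an owned one within the same
    // statement, before the temporary guard is dropped.
    fn read_inline(&self, key: &str) -> Option<String> {
        let value = self.entries.lock().unwrap().get(key).cloned();
        value
    }
}

fn main() {
    let db = Db::new();
    db.write("k", "v");
    println!("{:?} {:?}", db.read("k"), db.read_inline("missing"));
}
```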
I have two structs:
Client, which stores a callback and calls it in response to receiving new data. As an example, you can think of this as a websocket client, and we want to provide a hook for incoming messages.
BusinessLogic, which wants to hold a Client initialized with a callback that will update its local value in response to changes that the Client sees.
After following compiler hints, I arrived at the following minimal example:
Rust playground link
use rand::Rng;
struct Client<'cb> {
callback: Box<dyn FnMut(i64) + 'cb>,
}
impl<'cb> Client<'cb> {
fn do_thing(&mut self) {
// does stuff
let value = self._get_new_value();
// does more stuff
(self.callback)(value);
// does even more stuff
}
fn _get_new_value(&self) -> i64 {
let mut rng = rand::thread_rng();
rng.gen()
}
}
struct BusinessLogic<'cb> {
value: Option<i64>,
client: Option<Client<'cb>>,
}
impl<'cb> BusinessLogic<'cb> {
fn new() -> Self {
Self {
value: None,
client: None,
}
}
fn subscribe(&'cb mut self) {
self.client = Some(Client {
callback: Box::new(|value| {
self.value = Some(value);
})
})
}
}
fn main() {
let mut bl = BusinessLogic::new();
bl.subscribe();
println!("Hello, world!");
}
Problem is, I am still getting the following compiler error:
Compiling playground v0.0.1 (/playground)
error[E0597]: `bl` does not live long enough
--> src/main.rs:51:5
|
51 | bl.subscribe();
| ^^^^^^^^^^^^^^ borrowed value does not live long enough
...
54 | }
| -
| |
| `bl` dropped here while still borrowed
| borrow might be used here, when `bl` is dropped and runs the destructor for type `BusinessLogic<'_>`
For more information about this error, try `rustc --explain E0597`.
error: could not compile `playground` due to previous error
I understand why I'm seeing this error: the call to subscribe uses a borrow of bl with a lifetime of 'cb, which is not necessarily contained within the scope of main(). However, I don't see how to resolve this issue. Won't I always need to provide a lifetime for the callback stored in Client, which will end up bleeding through my code in the form of 'cb lifetime annotations?
More generally, I'm interested in understanding what is the canonical way of solving this callback/hook problem in Rust. I'm open to designs different from the one I have proposed, and if there are relevant performance concerns for various options, that would be useful to know also.
What you've created is a self-referential structure, which is problematic and not really expressible with references and lifetime annotations. See: Why can't I store a value and a reference to that value in the same struct? for the potential problems and workarounds. It's an issue here because you want to be able to mutate the BusinessLogic in the callback, but since it holds the Client, you could mutate the callback while it's running, which is no good.
I would instead suggest that the callback has full ownership of the BusinessLogic which does not directly reference the Client:
use rand::Rng;
struct Client {
callback: Box<dyn FnMut(i64)>,
}
impl Client {
fn do_thing(&mut self) {
let value = rand::thread_rng().gen();
(self.callback)(value);
}
}
struct BusinessLogic {
value: Option<i64>,
}
fn main() {
let mut bl = BusinessLogic {
value: None
};
let mut client = Client {
callback: Box::new(move |value| {
bl.value = Some(value);
})
};
client.do_thing();
println!("Hello, world!");
}
If you need the subscriber to have backwards communication to the Client, you can pass an additional parameter that the callback can mutate, or simply do it via the return value.
If you need more complicated communication from the Client to the callback, either send a Message enum as the argument, or make the callback a custom trait with additional methods instead of just FnMut.
If you need a single BusinessLogic to operate from multiple Clients, use Arc+Mutex to allow shared ownership.
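The last option can be sketched as follows. This is only an illustration under a few assumptions: do_thing takes the value as a parameter (instead of using rand) so the example needs no external crate, and make_client and the updates counter are hypothetical additions:

```rust
use std::sync::{Arc, Mutex};

struct Client {
    callback: Box<dyn FnMut(i64)>,
}

impl Client {
    // The value is passed in here purely to keep the sketch deterministic.
    fn do_thing(&mut self, value: i64) {
        (self.callback)(value);
    }
}

struct BusinessLogic {
    value: Option<i64>,
    updates: usize, // hypothetical counter, added for the demonstration
}

// Each client gets its own handle to the shared BusinessLogic.
fn make_client(bl: Arc<Mutex<BusinessLogic>>) -> Client {
    Client {
        callback: Box::new(move |value| {
            let mut bl = bl.lock().unwrap();
            bl.value = Some(value);
            bl.updates += 1;
        }),
    }
}

fn run_demo() -> (Option<i64>, usize) {
    let bl = Arc::new(Mutex::new(BusinessLogic { value: None, updates: 0 }));
    let mut first = make_client(Arc::clone(&bl));
    let mut second = make_client(Arc::clone(&bl));
    first.do_thing(1);
    second.do_thing(2);
    let guard = bl.lock().unwrap();
    (guard.value, guard.updates)
}

fn main() {
    println!("{:?}", run_demo());
}
```

Neither Client nor BusinessLogic needs a lifetime parameter here, because the callbacks own their handles instead of borrowing.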
I need a static array of structs and the structs contain a Vec. I can manage the lifetimes of the actual values. I get the following error:
: Mar23 ; cargo test
Compiling smalltalk v0.1.0 (/Users/dmason/git/AST-Smalltalk/rust)
error[E0507]: cannot move out of `dispatchTable[_]` as `dispatchTable` is a static item
--> src/minimal.rs:32:44
|
30 | let old = ManuallyDrop::into_inner(dispatchTable[pos]);
| ^^^^^^^^^^^^^^^^^^ move occurs because `dispatchTable[_]` has type `ManuallyDrop<Option<Box<Dispatch>>>`, which does not implement the `Copy` trait
error: aborting due to previous error
Here is a minimal compilable example:
#[derive(Copy, Clone)]
struct MethodMatch {
hash: i64,
method: Option<bool>,
}
#[derive(Clone)]
pub struct Dispatch {
class: i64,
table: Vec<MethodMatch>,
}
const max_classes : usize = 100;
use std::mem::ManuallyDrop;
const no_dispatch : ManuallyDrop<Option<Box<Dispatch>>> = ManuallyDrop::new(None);
static mut dispatchTable : [ManuallyDrop<Option<Box<Dispatch>>>;max_classes] = [no_dispatch;max_classes];
use std::sync::RwLock;
lazy_static! {
static ref dispatchFree : RwLock<usize> = {RwLock::new(0)};
}
pub fn addClass(c : i64, n : usize) {
let mut index = dispatchFree.write().unwrap();
let pos = *index;
*index += 1;
replaceDispatch(pos,c,n);
}
pub fn replaceDispatch(pos : usize, c : i64, n : usize) -> Option<Box<Dispatch>> {
let mut table = Vec::with_capacity(n);
table.resize(n,MethodMatch{hash:0,method:None});
unsafe {
let old = ManuallyDrop::into_inner(dispatchTable[pos]);
dispatchTable[pos]=ManuallyDrop::new(Some(Box::new(Dispatch{class:c,table:table})));
old
}
}
The idea I had was to have replaceDispatch create a new Dispatch option object, and replace the current value in the array with the new one, returning the original, with the idea that the caller will get the Dispatch option value and be able to use and then drop/deallocate the object.
I found that it will compile if I add .clone() right after the identified error point. But then the original value never gets dropped, so (the into_inner is redundant and) I'm creating a memory leak! Do I have to manually drop it (if I could figure out how)? I thought that's what ManuallyDrop bought me. In theory, if I copied the fields of the Vec into a new object, that object would point to the old data, so when it got dropped, the memory would get freed. But (a) that seems very dirty, (b) it's a bit of ugly, unnecessary code (I have to handle the Some/None cases, look inside the Vec, etc.), and (c) I can't see how I'd even do it!
As the compiler tells you, you cannot move a value out of a place observable by others. But since you have the replacement at the ready, you can use std::mem::replace:
pub fn replaceDispatch(pos: usize, c: i64, n: usize) -> Option<Box<Dispatch>> {
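// Table handling, unchanged from the question.
let mut table = Vec::with_capacity(n);
table.resize(n, MethodMatch { hash: 0, method: None });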
unsafe {
let old = std::mem::replace(
&mut dispatchTable[pos],
ManuallyDrop::new(Some(Box::new(Dispatch {
class: c,
table: table,
}))),
);
ManuallyDrop::into_inner(old)
}
}
Playground
In fact, since you're using the Option to manage the lifetime of Dispatch, you don't need ManuallyDrop at all, and you also don't need the Box: playground.
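As a rough sketch of that simplification, the function below works on plain Option<Dispatch> slots. To stay free of static mut and unsafe, it assumes the table is passed in as a mutable slice; that parameter and the snake_case name are choices made for this illustration, not what the linked playground necessarily does:

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
struct MethodMatch {
    hash: i64,
    method: Option<bool>,
}

#[derive(Clone, Debug, PartialEq)]
pub struct Dispatch {
    class: i64,
    table: Vec<MethodMatch>,
}

// Installs a fresh Dispatch at `pos` and hands the previous occupant back to
// the caller, who then owns it and drops it normally when done.
fn replace_dispatch(
    slots: &mut [Option<Dispatch>],
    pos: usize,
    class: i64,
    n: usize,
) -> Option<Dispatch> {
    let table = vec![MethodMatch { hash: 0, method: None }; n];
    std::mem::replace(&mut slots[pos], Some(Dispatch { class, table }))
}

fn main() {
    let mut slots: Vec<Option<Dispatch>> = vec![None; 100];
    let first = replace_dispatch(&mut slots, 0, 1, 3);
    let second = replace_dispatch(&mut slots, 0, 2, 5);
    println!("{:?} {:?}", first.is_none(), second.map(|d| d.class));
}
```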
As part of binding a C API to Rust, I have a mutable reference ph: &mut Ph, a struct struct EnsureValidContext<'a> { ph: &'a mut Ph }, and some methods:
impl Ph {
pub fn print(&mut self, s: &str) {
/*...*/
}
pub fn with_context<F, R>(&mut self, ctx: &Context, f: F) -> Result<R, InvalidContextError>
where
F: Fn(EnsureValidContext) -> R,
{
/*...*/
}
/* some others */
}
impl<'a> EnsureValidContext<'a> {
pub fn print(&mut self, s: &str) {
self.ph.print(s)
}
pub fn close(self) {}
/* some others */
}
I don't control these. I can only use these.
Now, the closure API is nice if you want the compiler to force you to think about performance and the tradeoffs between performance and the behaviour you want (context validation is expensive). However, let's say you just don't care about that and want it to just work.
I was thinking of making a wrapper that handles it for you:
enum ValidPh<'a> {
Ph(&'a mut Ph),
Valid(*mut Ph, EnsureValidContext<'a>),
Poisoned,
}
impl<'a> ValidPh<'a> {
pub fn print(&mut self, s: &str) {
/* whatever the case, just call .print() on the inner object */
}
pub fn set_context(&mut self, ctx: &Context) {
/*...*/
}
pub fn close(&mut self) {
/*...*/
}
/* some others */
}
This would work by, whenever necessary, checking if we're a Ph or a Valid, and if we're a Ph we'd upgrade to a Valid by going:
fn upgrade(&mut self) {
if let Ph(_) = self { // don't call mem::replace unless we need to
if let Ph(ph) = mem::replace(self, Poisoned) {
let ptr = ph as *mut _;
let evc = ph.with_context(ph.get_context(), |evc| evc);
*self = Valid(ptr, evc);
}
}
}
Downgrading is different for each method, as it has to call the target method, but here's an example close:
pub fn close(&mut self) {
if let Valid(_, _) = self {
/* ok */
} else {
self.upgrade()
}
if let Valid(ptr, evc) = mem::replace(self, Poisoned) {
evc.close(); // consume the evc, dropping the borrow.
// we can now use our original borrow, but since we don't have it anymore, bring it back using our trusty ptr
*self = unsafe { Ph(&mut *ptr) };
} else {
// this can only happen due to a bug in our code
unreachable!();
}
}
You get to use a ValidPh like:
/* given a &mut vph */
vph.print("hello world!");
if vph.set_context(ctx) {
vph.print("closing existing context");
vph.close();
}
vph.print("opening new context");
vph.open("context_name");
vph.print("printing in new context");
Without vph, you'd have to juggle &mut Ph and EnsureValidContext around on your own. While the Rust compiler makes this trivial (just follow the errors), you may want to let the library handle it automatically for you. Otherwise you might end up just calling the very expensive with_context for every operation, regardless of whether the operation can invalidate the context or not.
Note that this code is rough pseudocode. I haven't compiled or tested it yet.
One might argue I need an UnsafeCell or a RefCell or some other Cell. However, from reading this it appears UnsafeCell is only a lang item because of interior mutability — it's only necessary if you're mutating state through an &T, while in this case I have &mut T all the way.
However, my reading may be flawed. Does this code invoke UB?
(Full code of Ph and EnsureValidContext, including FFI bits, available here.)
Taking a step back, the guarantees upheld by Rust are:
&T is a reference to T which is potentially aliased,
&mut T is a reference to T which is guaranteed not to be aliased.
The crux of the question therefore is: what does "guaranteed not to be aliased" mean?
Let's consider a safe Rust sample:
struct Foo(u32);
impl Foo {
fn foo(&mut self) { self.bar(); }
fn bar(&mut self) { self.0 += 1; }
}
fn main() { Foo(0).foo(); }
If we take a peek at the stack when Foo::bar is being executed, we'll see at least two pointers to Foo: one in bar and one in foo, and there may be further copies on the stack or in other registers.
So, clearly, there are aliases in existence. How come! It's guaranteed NOT to be aliased!
Take a deep breath: how many of those aliases can you access at a time?
Only 1. The guarantee of no aliasing is not spatial but temporal.
I would think, therefore, that at any point in time, if a &mut T is accessible, then no other reference to this instance must be accessible.
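A small safe-Rust sketch of that temporal rule, using reborrows (the function name bump_twice is made up for the example). Two mutable references to the same value exist at once, but only one of them is usable at any given point:

```rust
fn bump_twice(x: &mut u32) -> u32 {
    let outer = &mut *x;
    {
        // Reborrow: while `inner` is in use, `outer` cannot be touched.
        let inner = &mut *outer;
        *inner += 1;
    }
    // `inner` is no longer used, so `outer` is accessible again.
    *outer += 1;
    *x
}

fn main() {
    let mut value = 0;
    println!("{}", bump_twice(&mut value));
}
```

Using `outer` between the creation of `inner` and its last use would be rejected by the borrow checker.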
Having a raw pointer (*mut T) is perfectly fine, it requires unsafe to access; however forming a second reference may or may not be safe, even without using it, so I would avoid it.
Rust's memory model is not rigorously defined yet, so it's hard to say for sure, but I believe it's not undefined behavior to:
carry a *mut Ph around while a &'a mut Ph is also reachable from another path, so long as you don't dereference the *mut Ph, even just for reading, and don't convert it to a &Ph or &mut Ph, because mutable references grant exclusive access to the pointee.
cast the *mut Ph back to a &'a mut Ph once the other &'a mut Ph falls out of scope.
I'm wrapping a C API which allows the caller to set/get an arbitrary pointer via function calls. In this way, the C API allows a caller to associate arbitrary data with one of the C API objects. This data is not used in any callbacks, it's just a pointer that a user can stash away and get at later.
My wrapper struct implements the Drop trait for the C object that contains this pointer. What I'd like to be able to do, but am not sure it's possible, is have the data dropped correctly if the pointer is not null when the wrapper struct drops. I'm not sure how I would recover the correct type though from a raw c_void pointer.
Two alternatives I'm thinking of are
Implement the behavior of these two calls in the wrapper. Don't make any calls to the C API.
Don't attempt to offer any kind of safer interface to these functions. Document that the pointer must be managed by the caller of the wrapper.
Is what I want to do possible? If not, is there a generally accepted practice for these kinds of situations?
A naive + fully automatic approach is NOT possible for the following reasons:
freeing memory does not call drop/destructors/...: the C API can be used from languages whose objects need to be destructed properly, e.g. C++ or Rust itself. So when you only store a memory pointer, you do not know how to call the proper function (you know neither which function to call nor what its calling convention looks like).
which memory allocator?: memory allocation and deallocation aren't trivial. Your program needs to request memory from the OS and then manage this resource intelligently to be efficient and correct. This is usually done by a library. Rust has historically used jemalloc by default (and the allocator can be changed). So even when you ask the API caller to only pass Plain Old Data (which should be easier to destruct), you still don't know which library function to call to deallocate the memory. Just using libc::free won't work reliably (it may work, but it could also fail horribly).
Solutions:
dealloc callback: you can ask the API user to set an additional pointer to, let's say, a void destruct(void* ptr) function. If it is not NULL, you call that function during your drop. You could also use int as a return type to signal that the destruction went wrong, in which case you could, for example, panic!.
global callback: let's assume you requested your user to only pass POD (plain old data). To know which free function of the memory allocator to call, you could request the user to register a global void (*free)(void* ptr) pointer which is called during drop. You could also make that one optional.
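The dealloc-callback idea might look roughly like this on the Rust side. Everything here is a sketch under assumptions: UserData, Destructor, free_boxed_i32 and the DROPS counter are hypothetical names, and the pointer is kept in a plain struct rather than being read back from a real C API:

```rust
use std::os::raw::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical destructor type the API user would register with the data.
type Destructor = unsafe extern "C" fn(*mut c_void);

// Hypothetical wrapper: the stashed pointer plus its optional destructor.
struct UserData {
    ptr: *mut c_void,
    destruct: Option<Destructor>,
}

impl Drop for UserData {
    fn drop(&mut self) {
        // Only the registered callback knows the real type and allocator.
        if let Some(destruct) = self.destruct {
            if !self.ptr.is_null() {
                unsafe { destruct(self.ptr) };
            }
        }
    }
}

// Counts destructor calls so the demo can observe them.
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Example callback for data that was created with Box<i32>.
unsafe extern "C" fn free_boxed_i32(ptr: *mut c_void) {
    unsafe { drop(Box::from_raw(ptr as *mut i32)) };
    DROPS.fetch_add(1, Ordering::SeqCst);
}

fn run_demo() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    let raw = Box::into_raw(Box::new(42_i32)) as *mut c_void;
    {
        let _data = UserData { ptr: raw, destruct: Some(free_boxed_i32) };
    } // _data is dropped here, which invokes the callback
    DROPS.load(Ordering::SeqCst) - before
}

fn main() {
    println!("destructor calls: {}", run_demo());
}
```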
Although I was able to follow the advice in this thread, I wasn't entirely satisfied with my results, so I asked the question on the Rust forums and found the answer I was really looking for. (play)
use std::any::Any;
static mut foreign_ptr: *mut () = 0 as *mut ();
unsafe fn api_set_fp(ptr: *mut ()) {
foreign_ptr = ptr;
}
unsafe fn api_get_fp() -> *mut() {
foreign_ptr
}
struct ApiWrapper {}
impl ApiWrapper {
fn set_foreign<T: Any>(&mut self, value: Box<T>) {
self.free_foreign();
unsafe {
// Box the trait object again so the pointer handed to the API is thin.
let raw = Box::into_raw(Box::new(value as Box<dyn Any>));
api_set_fp(raw as *mut ());
}
}
fn get_foreign_ref<T: Any>(&self) -> Option<&T> {
unsafe {
let raw = api_get_fp() as *const Box<dyn Any>;
if !raw.is_null() {
let b: &Box<dyn Any> = &*raw;
b.downcast_ref()
} else {
None
}
}
}
fn get_foreign_mut<T: Any>(&mut self) -> Option<&mut T> {
unsafe {
let raw = api_get_fp() as *mut Box<dyn Any>;
if !raw.is_null() {
let b: &mut Box<dyn Any> = &mut *raw;
b.downcast_mut()
} else {
None
}
}
}
fn free_foreign(&mut self) {
unsafe {
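let raw = api_get_fp() as *mut Box<dyn Any>;
if !raw.is_null() {
// Reclaim ownership so the boxed value is dropped, then clear the
// stored pointer so that a second call cannot free it twice.
drop(Box::from_raw(raw));
api_set_fp(std::ptr::null_mut());
}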
}
}
}
impl Drop for ApiWrapper {
fn drop(&mut self) {
self.free_foreign();
}
}
struct MyData {
i: i32,
}
impl Drop for MyData {
fn drop(&mut self) {
println!("Dropping MyData with value {}", self.i);
}
}
fn main() {
let p1 = Box::new(MyData {i: 1});
let mut api = ApiWrapper{};
api.set_foreign(p1);
{
let p2 = api.get_foreign_ref::<MyData>().unwrap();
println!("i is {}", p2.i);
}
api.set_foreign(Box::new("Hello!"));
{
let p3 = api.get_foreign_ref::<&'static str>().unwrap();
println!("payload is {}", p3);
}
}