Build object with specific lifetime - reference

I have a problem that I don't know exactly how to model in Rust with regard to ownership, lifetimes and all that.
I have a large struct:
struct LargeStruct {
x: f32,
n: i32,
// lot of other fields.
}
The struct being large, I want to avoid copies, and therefore I carry around pointers to it.
I have a foo function that takes a pointer to LargeStruct and an f32 value. If this value is larger than the field x, I want to create a new object with x set to this value and return it. If it is not larger, I want to return the original pointer itself.
I naively implemented it like this:
fn foo(object: &LargeStruct, y: f32) -> &LargeStruct {
if object.x < y {
LargeStruct {
x: y,
n: object.n,
// ...
}
}
else {
object
}
}
But it does not work: the two branches of the if do not return the same type. In the first case, I actually return a LargeStruct, and in the second I return a &LargeStruct. If I modify the object construction to take its pointer:
&LargeStruct {
then it doesn't work either: the lifetime of the object constructed is too short, so I can not return that from the function.
If I try to build the object on the heap:
~LargeStruct {
I have now another compilation error:
if and else have incompatible types: expected ~LargeStruct but found
&LargeStruct (expected &-ptr but found ~-ptr)
I tried to specify a lifetime in the function signature:
fn foo<'a>(object: &'a LargeStruct, y: f32) -> &'a LargeStruct {
But it does not help: I don't know how to build a new LargeStruct with the same lifetime.
I am calling this function like this:
fn main() {
let object = LargeStruct{
x: 1.0,
n: 2,
// ...
};
let result = foo(&object, 2.0);
}

The intent behind your approach is to only return a modified copy under certain conditions if I understand it right. In Rust this can be modeled with a function that returns an Option<~LargeStruct> type (perhaps even with Option<LargeStruct> but I'm not sure if the compiler can efficiently move large objects in this case).
fn foo(object: &LargeStruct, y: f32) -> Option<~LargeStruct> {
if object.x < y {
return Some(~LargeStruct {
x: y,
//n: object.n,
// ...
})
}
None
}
As for why your approach didn't work: Rust doesn't let you return a reference to an object that will be freed once the function returns. A lifetime is a way to say that an object must live at least as long as the references to it.
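In current Rust the ~ pointer is spelled Box, so the idea above might be sketched as follows. This is only a minimal sketch: LargeStruct is cut down to the two fields shown in the question, which is an assumption made for illustration.

```rust
// Minimal sketch: LargeStruct is reduced to two fields for illustration.
#[derive(Debug)]
struct LargeStruct {
    x: f32,
    n: i32,
}

// Returns a freshly boxed struct only when `y` exceeds `object.x`.
// `None` tells the caller to keep using the original object.
fn foo(object: &LargeStruct, y: f32) -> Option<Box<LargeStruct>> {
    if object.x < y {
        Some(Box::new(LargeStruct { x: y, n: object.n }))
    } else {
        None
    }
}
```

A caller that wants a single reference in both cases can keep the returned Option alive in a local variable, say replaced, and then write replaced.as_deref().unwrap_or(&object), which falls back to the original when nothing was created.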

The answer to the question as asked
It is not possible to make this design completely transparent: a reference can never outlive the scope of its owner, so a function cannot return a reference to an object that it created itself and that is dropped when the function returns.
This can be solved in a way with an enum specifying the return type as either a reference to a LargeStruct or a LargeStruct, but it's clumsy:
pub struct LargeStruct {
x: f32,
n: i32,
// ...
}
pub enum LargeStructOrRef<'a> {
LargeStructRef(&'a LargeStruct),
LargeStruct(LargeStruct),
}
fn foo<'a>(object: &'a LargeStruct, y: f32) -> LargeStructOrRef<'a> {
if object.x < y {
LargeStruct(LargeStruct {
x: y,
n: object.n,
// ...
})
} else {
LargeStructRef(object)
}
}
You'd then need to do pattern matching between LargeStruct and LargeStructRef—it can't be made transparent to the caller.
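The standard library provides this kind of borrowed-or-owned enum as std::borrow::Cow, so the hand-written enum may not be needed. A minimal sketch, assuming LargeStruct can derive Clone (Cow requires it) and has only the two fields from the question:

```rust
use std::borrow::Cow;

// Assumption: the struct is reduced to two fields and derives Clone,
// which Cow needs in order to name the owned form.
#[derive(Clone, Debug)]
struct LargeStruct {
    x: f32,
    n: i32,
}

// Borrow the original when it is already large enough,
// otherwise hand back a newly built, owned value.
fn foo(object: &LargeStruct, y: f32) -> Cow<'_, LargeStruct> {
    if object.x < y {
        Cow::Owned(LargeStruct { x: y, n: object.n })
    } else {
        Cow::Borrowed(object)
    }
}
```

Because Cow implements Deref, read-only access such as foo(&object, 2.0).x works without any matching; a match is only needed when the caller has to tell the two cases apart.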
Alternative designs
You can probably design this in a different way which will resolve these difficulties. For example, you might make foo take a &mut LargeStruct and modify the struct rather than creating a new one if object.x < y. (This is most likely to be the design you actually want, I think.)
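The in-place alternative could look roughly like this. It is a sketch under the assumption that the caller does not need to keep the old value around; LargeStruct is again reduced to two fields.

```rust
// Assumption: two fields stand in for the full struct.
struct LargeStruct {
    x: f32,
    n: i32,
}

// Raises `x` to `y` in place when `y` is larger; nothing is copied
// and no new object is created.
fn foo(object: &mut LargeStruct, y: f32) {
    if object.x < y {
        object.x = y;
    }
}
```

The caller then declares the object with let mut and passes &mut object, and afterwards reads the (possibly updated) fields from the same variable.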

Related

Rust impl default trait with private fields

I'm getting an error when I have this sort of setup:
default_test.rs:
mod default_mod;
use default_mod::Point;
fn main() {
let _p1 = Point::new();
let _p2: Point = Point {
z: 1,
..Default::default()
};
}
default_mod.rs:
pub struct Point {
x: i32,
y: i32,
pub z: i32,
}
impl Point {
pub fn new() -> Self {
Point { x: 0, y: 0, z: 0 }
}
}
impl Default for Point {
fn default() -> Self {
Point { x: 0, y: 0, z: 0 }
}
}
which gives the compiler error:
default_test.rs:9:7
|
9 | ..Default::default()
| ^^^^^^^^^^^^^^^^^^ field `x` is private
error[E0451]: field `y` of struct `default_mod::Point` is private
Short version - I have a struct with both public and private fields. I would like to initialise this struct with default values, but sometimes override them.
I can't seem to fix this error, nor seen anything on the internet or the docs that even mentions errors like this.
It surprises me, because I'd think a common use-case would be to initialise a struct, and that some members of the struct would be private so you can hide implementation details behind an interface.
In my case the private field is a Vec as I have some logic that needs to go into adding or removing things from the vector, so I want to make it private to prevent anyone messing up the data structure.
What are my options here?
It surprises me, because I'd think a common use-case would be to initialise a struct, and that some members of the struct would be private so you can hide implementation details behind an interface.
The problem is that the struct update syntax doesn't do what you think it does. For example, the book shows the following code:
let user2 = User {
email: String::from("another@example.com"),
username: String::from("anotherusername567"),
..user1
};
The ..user1 syntax fills in the User fields we haven't explicitly specified, such as active: user1.active, signin_count: user1.signin_count. .. may be followed by an arbitrary expression which returns the structure, which is where Default::default() comes into play, and means the same as User::default() because a User is expected. However, the desugaring remains unchanged and boils down to assigning individual fields, in neither case granting special access to private fields.
To get back to your example, this code:
let p = Point {
z: 1,
..Default::default()
};
is syntactic sugar for:
let p = {
let _tmp = Point::default();
Point {
x: _tmp.x,
y: _tmp.y,
z: 1,
}
};
and not for the expected:
// NOT what happens
let p = {
let mut _tmp = Point::default();
_tmp.z = 1;
_tmp
};
What are my options here?
The most idiomatic option is to provide a builder for Point. That is also somewhat bulky [1], so if you're looking for a simple solution, you could also use Point::default() and set the z attribute manually. Outside the defining module, the struct update syntax is incompatible with structs that have private fields, so it is just not useful for your type.
[1] Though there are crates like derive_builder, typed-builder and builder-pattern that take some of the drudgery away.
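Both suggestions can be sketched in a few lines. The with_z and xy methods below are hypothetical names introduced for illustration; they are not part of the original code.

```rust
mod default_mod {
    #[derive(Debug, Default)]
    pub struct Point {
        x: i32,
        y: i32,
        pub z: i32,
    }

    impl Point {
        // Hypothetical builder-style setter: consumes the point and
        // returns it with `z` replaced, so calls can be chained.
        pub fn with_z(mut self, z: i32) -> Self {
            self.z = z;
            self
        }

        // Hypothetical read-only accessor for the private fields.
        pub fn xy(&self) -> (i32, i32) {
            (self.x, self.y)
        }
    }
}

use default_mod::Point;
```

From outside the module, callers can then write either Point::default().with_z(1), or create a mutable Point::default() and assign to its public z field directly. Neither form needs access to x or y.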
What are my options here?
A new() with parameters or a builder.
..struct is just a convenient way of doing functional updates, it doesn't bypass ACLs. Here since your struct has private fields, users can not manipulate it as a "bare" struct, they have to treat it as a largely opaque type.

How to return a reference to a method and the struct it relates to

I have a function that will create one of several structs (all of which have a method of the same signature but have other, different, methods and traits); I would like to instance one of the structs in a function and return a reference to its method that can be called elsewhere.
// Pseudocode
type SizeGetter = fn()-> isize;
fn get_ref()-> &SizeGetter{
let catch = Fish{weight: 12};
&catch.get_weight()
//Fish.get_weight() is used here but it may be any struct.method() -> isize
}
fn main() {
let getit = get_ref();
println!("{}", getit());
}
In the above my goal is to define catch.get_weight() in a function, return a reference to that method and then call it later to get the size.
That original attempt could not work because you cannot return a reference to something created in a function. In this case, returning something equivalent to a method to a locally created struct value requires the value to outlive the function's lifetime as well.
We can reference a method bar in a struct Foo with Foo::bar, but this one isn't bound to a receiver value. There is no syntax specifically for referencing a method call on a value. The solution instead is to create a closure that captures the value and calls the method.
let foo = Foo::new();
move || foo.bar()
Considering this Fish struct and implementation (adjusted to comply with naming conventions):
struct Fish {
weight: usize,
}
impl Fish {
fn weight(&self) -> usize {
self.weight
}
}
A function returning another self-sufficient function would be written like so:
fn fish_weight() -> impl Fn() -> usize {
let r#catch = Fish { weight: 12 };
move || r#catch.weight()
}
Using:
let get = fish_weight();
println!("Fish weight: {}", get());
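Since the question mentions that the function may create one of several structs, impl Fn is not quite enough, because every branch would have to return the same closure type. A possible sketch uses a boxed trait object instead. The Boot struct and the caught_fish flag are hypothetical and only there to provide a second case.

```rust
struct Fish {
    weight: usize,
}

impl Fish {
    fn weight(&self) -> usize {
        self.weight
    }
}

// Hypothetical second type with a method of the same shape.
struct Boot {
    size: usize,
}

impl Boot {
    fn size(&self) -> usize {
        self.size
    }
}

// Each branch builds a different closure type, so the result is boxed
// as a trait object to give both branches one common return type.
fn size_getter(caught_fish: bool) -> Box<dyn Fn() -> usize> {
    if caught_fish {
        let fish = Fish { weight: 12 };
        Box::new(move || fish.weight())
    } else {
        let boot = Boot { size: 43 };
        Box::new(move || boot.size())
    }
}
```

The returned box is called like any other function value, for example size_getter(true)().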

How to sort a Vec of structs by a String field?

I'm having difficulty with a seemingly trivial sort by String field. Repro below:
struct Dummy {
x: String,
y: i8
}
fn main() {
let mut dummies: Vec<Dummy> = Vec::new();
dummies.push(Dummy { x: "a".to_string(), y: 1 });
dummies.push(Dummy { x: "b".to_string(), y: 2 });
dummies.sort_by_key(|d| d.x); // error[E0507]: cannot move out of borrowed content
dummies.sort_by_key(|d| d.y); // This is fine
}
Can someone please explain what exactly is going wrong and how to fix it?
First, let's look at your original error message, then we'll go through a few fixes and try to understand everything.
In the closure that you use in dummies.sort_by_key(|d| d.x);, d is a reference to a Dummy instance. However, the field access d.x is the String itself. If you wanted to return that String, you'd have to give ownership of it to whatever called the closure. But since d was just a reference, you can't pass ownership of its data.
One easy fix is to simply clone the string as dummies.sort_by_key(|d| d.x.clone());. This makes a copy of the string before returning it in the closure (this is Andra's solution). This works perfectly, but if performance or memory use is an issue, we can avoid the clone.
The idea here is that using the string as the key is wasteful. Really, all we need to know is which of two strings is smaller. If we use the string as a key, then every time the sort function needs to compare two Dummys, it calls the key function on each one and the strings are passed to a (very short) function that simply compares them. If we did the comparison in the same context as the borrow, we'd be able to simply pass the result of the comparison on, rather than the strings.
The solution is the sort_by method on slices. This allows us to take references to two Dummys and decide if one is smaller than the other. For example, we can use it like dummies.sort_by(|d1, d2| d1.x.cmp(&d2.x));
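Put together, the sort_by approach might look like the sketch below. The helper name sort_by_x is an assumption made for illustration.

```rust
#[derive(Debug)]
struct Dummy {
    x: String,
    y: i8,
}

// Compares the two strings through references, so no String is
// cloned or moved out of the borrowed elements.
fn sort_by_x(dummies: &mut [Dummy]) {
    dummies.sort_by(|d1, d2| d1.x.cmp(&d2.x));
}
```

Calling sort_by_x(&mut dummies) on a Vec<Dummy> reorders the elements by their x field and leaves each y attached to its original x.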
Addendum
Why can't we use sort_by_key without cloning the Strings? Surely there must be some clever way of using string slices and lifetimes to do it.
Let's look at the signature of the sort_by_key function.
pub fn sort_by_key<K, F>(&mut self, f: F) where
F: FnMut(&T) -> K,
K: Ord,
The interesting part of this function is not what is there, but what isn't there. The type parameter K doesn't depend on the lifetime of the reference passed to f.
As the slice is sorted, the key function gets repeatedly called with a reference to a Dummy instance. Since the slice is sorted a little between each call, the lifetime of the reference must be very short. If it were longer, it'd get invalidated the next time the elements of the slice were moved around. However, K can't depend on that lifetime. That means that whatever our key function is, it can't return anything that depends on the current location of the Dummy (e.g. a string slice, a reference, or any other clever construction [1]).
However, we could make K depend on the lifetime of whatever is passed to it. The idea here is what's called Higher-Rank Trait Bounds. These currently only work with lifetimes (though in theory they could be extended to all type parameters). We could posit another slice method with signature
fn sort_by_key_hrtb<T, F, K>(slice: &mut [T], f: F)
where
F: Fn(&T) -> &K,
K: Ord,
Why does this make things work? In F: Fn(&T) -> &K, the lifetime of the output reference is implicitly the same as (or longer than) the lifetime of the input reference. Desugared, this is F: for<'a> Fn(&'a T) -> &'a K, which says that f should be able to take a reference with any lifetime 'a and return a reference with lifetime (greater than or equal to) 'a. Now we have a method that works exactly how you wanted to use it (except for a pesky & [2]).
[1] Actually, there is one (unsafe) clever construction that probably works, but I haven't vetted it. You can use a wrapper around a raw pointer to a String and then impl Ord for that wrapper so that it dereferences the pointer to do the comparison [3]. The return type for the key function would be *const String, so we don't need any lifetimes. This is inherently unsafe though, and I definitely wouldn't recommend it.
[2] The only reason we need to use &mut dummies here is that sort_by_key_hrtb isn't actually a slice method. If it were, dummies would be automatically borrowed and dereferenced into a slice, so we could call the function like dummies.sort_by_key_hrtb(|d| &d.x);.
[3] Why a wrapper instead of just a pointer? *const T implements Ord, but it does so by comparing the addresses rather than the underlying value (if any), which isn't what we want here.
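One way the posited sort_by_key_hrtb function could be implemented is on top of sort_by. This is a sketch and not a standard library method; only the signature comes from the text above, the body is an assumption.

```rust
#[derive(Debug)]
struct Dummy {
    x: String,
    y: i8,
}

// Sketch of the hypothetical function: the key is returned by
// reference, so its lifetime is tied to the element being inspected.
fn sort_by_key_hrtb<T, F, K>(slice: &mut [T], f: F)
where
    F: Fn(&T) -> &K,
    K: Ord,
{
    // Both keys are borrowed only for the duration of one comparison.
    slice.sort_by(|a, b| f(a).cmp(f(b)));
}
```

With a Vec<Dummy> named dummies, a call such as sort_by_key_hrtb(dummies.as_mut_slice(), |d| &d.x) sorts by the String field without cloning it.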
The sort_by_key function takes ownership of the key:
pub fn sort_by_key<K, F>(&mut self, f: F)
This is why you are getting error E0507.
A simple fix is to store a reference in your struct, so that sort_by_key does not need to take ownership of the key.
You then need to add a lifetime parameter to the struct, which ties each Dummy to the string data it borrows and guarantees that the data outlives the struct.
struct Dummy<'a> {
x: &'a str,
y: i8,
}
fn main() {
let mut dummies: Vec<Dummy> = Vec::new();
dummies.push(Dummy { x: "a", y: 1 });
dummies.push(Dummy { x: "b", y: 2 });
dummies.sort_by_key(|d| d.x);
dummies.sort_by_key(|d| d.y);
}
I think it's because it's trying to move the String from one struct to another struct.
This works fine
struct Dummy {
x: String,
y: i8
}
fn main() {
let mut dummies: Vec<Dummy> = Vec::new();
dummies.push(Dummy { x: "a".to_string(), y: 1 });
dummies.push(Dummy { x: "b".to_string(), y: 2 });
dummies.sort_by_key(|d| d.x.clone()); // Clone the string
dummies.sort_by_key(|d| d.y); // This is fine
}
The behavior looks something like this
struct Dummy {
x: String,
y: i8
}
fn main() {
let mut dummies: Vec<Dummy> = Vec::new();
dummies.push(Dummy { x: "a".to_string(), y: 1 });
dummies.push(Dummy { x: "b".to_string(), y: 2 });
let mut temp = Dummy{ x: "c".to_string(), y: 3 };
temp.x = dummies[0].x; // Error[E0507]: cannot move out of borrowed content
}
Using clone() like the example above
struct Dummy {
x: String,
y: i8
}
fn main() {
let mut dummies: Vec<Dummy> = Vec::new();
dummies.push(Dummy { x: "a".to_string(), y: 1 });
dummies.push(Dummy { x: "b".to_string(), y: 2 });
let mut temp = Dummy{ x: "c".to_string(), y: 3 };
temp.x = dummies[0].x.clone(); // Fine
}

Vector of traits (dynamic dispatch) which contains associated type (also dynamic dispatch) [duplicate]

I have a program that involves examining a complex data structure to see if it has any defects. (It's quite complicated, so I'm posting example code.) All of the checks are unrelated to each other, and will all have their own modules and tests.
More importantly, each check has its own error type that contains different information about how the check failed for each number. I'm doing it this way instead of just returning an error string so I can test the errors (it's why Error relies on PartialEq).
My Code So Far
I have traits for Check and Error:
trait Check {
type Error;
fn check_number(&self, number: i32) -> Option<Self::Error>;
}
trait Error: std::fmt::Debug + PartialEq {
fn description(&self) -> String;
}
And two example checks, with their error structs. In this example, I want to show errors if a number is negative or even:
#[derive(PartialEq, Debug)]
struct EvenError {
number: i32,
}
struct EvenCheck;
impl Check for EvenCheck {
type Error = EvenError;
fn check_number(&self, number: i32) -> Option<EvenError> {
if number % 2 == 0 {
Some(EvenError { number: number })
} else {
None
}
}
}
impl Error for EvenError {
fn description(&self) -> String {
format!("{} is even", self.number)
}
}
#[derive(PartialEq, Debug)]
struct NegativeError {
number: i32,
}
struct NegativeCheck;
impl Check for NegativeCheck {
type Error = NegativeError;
fn check_number(&self, number: i32) -> Option<NegativeError> {
if number < 0 {
Some(NegativeError { number: number })
} else {
None
}
}
}
impl Error for NegativeError {
fn description(&self) -> String {
format!("{} is negative", self.number)
}
}
I know that in this example, the two structs look identical, but in my code, there are many different structs, so I can't merge them. Lastly, an example main function, to illustrate the kind of thing I want to do:
fn main() {
let numbers = vec![1, -4, 64, -25];
let checks = vec![
Box::new(EvenCheck) as Box<Check<Error = Error>>,
Box::new(NegativeCheck) as Box<Check<Error = Error>>,
]; // What should I put for this Vec's type?
for number in numbers {
for check in checks {
if let Some(error) = check.check_number(number) {
println!("{:?} - {}", error, error.description())
}
}
}
}
You can see the code in the Rust playground.
Solutions I've Tried
The closest thing I've come to a solution is to remove the associated types and have the checks return Option<Box<Error>>. However, I get this error instead:
error[E0038]: the trait `Error` cannot be made into an object
--> src/main.rs:4:55
|
4 | fn check_number(&self, number: i32) -> Option<Box<Error>>;
| ^^^^^ the trait `Error` cannot be made into an object
|
= note: the trait cannot use `Self` as a type parameter in the supertraits or where-clauses
because of the PartialEq in the Error trait. Rust has been great to me thus far, and I really hope I'm able to bend the type system into supporting something like this!
When you write an impl Check and specialize your type Error with a concrete type, you are ending up with different types.
In other words, Check<Error = NegativeError> and Check<Error = EvenError> are statically different types. Although you might expect Check<Error> to describe both, note that in Rust NegativeError and EvenError are not sub-types of Error. They are guaranteed to implement all methods defined by the Error trait, but calls to those methods will be statically dispatched to physically different functions that the compiler creates (one version for NegativeError and another for EvenError).
Therefore, you can't put them in the same Vec, even boxed (as you discovered). It's not so much a matter of knowing how much space to allocate, it's that Vec requires its elements to be of a single type (you can't have a vec![1u8, 'a'] either, even though the character 'a' would fit in a u8).
Rust's way to "erase" some of the type information and gain the dynamic dispatch part of subtyping is, as you discovered, trait objects.
If you want to give another try to the trait object approach, you might find it more appealing with a few tweaks...
You might find it much easier if you used the Error trait in std::error instead of your own version of it.
You may need to impl Display to create a description with a dynamically built String, like so:
use std::error::Error;
use std::fmt;
// Display builds the human-readable message dynamically.
impl fmt::Display for EvenError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{} is even", self.number)
}
}
impl Error for EvenError {
fn description(&self) -> &str { "even error" }
}
Now you can drop the associated type and have Check return a trait object:
trait Check {
fn check_number(&self, number: i32) -> Option<Box<Error>>;
}
Your Vec now has an expressible type:
let mut checks: Vec<Box<Check>> = vec![
Box::new(EvenCheck) ,
Box::new(NegativeCheck) ,
];
The best part of using std::error::Error is that you no longer need PartialEq to find out which error was returned. Error provides downcasting methods and type checks if you do need to retrieve the concrete error type from your trait object.
for number in numbers {
for check in &mut checks {
if let Some(error) = check.check_number(number) {
println!("{}", error);
if let Some(s_err)= error.downcast_ref::<EvenError>() {
println!("custom logic for EvenErr: {} - {}", s_err.number, s_err)
}
}
}
}
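For reference, the pieces of this answer might fit together as in the following sketch, written with the dyn keyword that current Rust requires for trait objects. The run_checks helper is a hypothetical name introduced here so the loop can return its messages.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug, PartialEq)]
struct EvenError {
    number: i32,
}

impl fmt::Display for EvenError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{} is even", self.number)
    }
}

impl Error for EvenError {}

#[derive(Debug, PartialEq)]
struct NegativeError {
    number: i32,
}

impl fmt::Display for NegativeError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{} is negative", self.number)
    }
}

impl Error for NegativeError {}

// No associated type: every check returns the same boxed trait object.
trait Check {
    fn check_number(&self, number: i32) -> Option<Box<dyn Error>>;
}

struct EvenCheck;

impl Check for EvenCheck {
    fn check_number(&self, number: i32) -> Option<Box<dyn Error>> {
        if number % 2 == 0 {
            Some(Box::new(EvenError { number }))
        } else {
            None
        }
    }
}

struct NegativeCheck;

impl Check for NegativeCheck {
    fn check_number(&self, number: i32) -> Option<Box<dyn Error>> {
        if number < 0 {
            Some(Box::new(NegativeError { number }))
        } else {
            None
        }
    }
}

// Hypothetical helper: runs every check on every number and collects
// the Display output of each error that was reported.
fn run_checks(checks: &[Box<dyn Check>], numbers: &[i32]) -> Vec<String> {
    let mut messages = Vec::new();
    for &number in numbers {
        for check in checks {
            if let Some(error) = check.check_number(number) {
                messages.push(error.to_string());
            }
        }
    }
    messages
}
```

Tests can still compare concrete errors: calling downcast_ref::<EvenError>() on the boxed error yields an Option<&EvenError>, which can be compared with assert_eq! because EvenError derives PartialEq.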
I eventually found a way to do it that I'm happy with. Instead of having a vector of Box<Check<???>> objects, have a vector of closures that all have the same type, abstracting away the very functions that get called:
fn main() {
type Probe = Box<Fn(i32) -> Option<Box<Error>>>;
let numbers: Vec<i32> = vec![ 1, -4, 64, -25 ];
let checks = vec![
Box::new(|num| EvenCheck.check_number(num).map(|u| Box::new(u) as Box<Error>)) as Probe,
Box::new(|num| NegativeCheck.check_number(num).map(|u| Box::new(u) as Box<Error>)) as Probe,
];
for number in numbers {
for check in checks.iter() {
if let Some(error) = check(number) {
println!("{}", error.description());
}
}
}
}
Not only does this allow for a vector of Box<Error> objects to be returned, it allows the Check objects to provide their own Error associated type which doesn't need to implement PartialEq. The multiple ases look a little messy, but on the whole it's not that bad.
I'd suggest you some refactoring.
First, I'm pretty sure that vectors must be homogeneous in Rust, so there is no way to store elements of different types in them. Also, you cannot downcast traits to reduce them to a common base trait (as I remember, there was a question about it on SO).
So I'd use an algebraic data type with an explicit match for this task, like this:
enum Checker {
Even(EvenCheck),
Negative(NegativeCheck),
}
let checks = vec![
Checker::Even(EvenCheck),
Checker::Negative(NegativeCheck),
];
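The explicit match mentioned above could be sketched as follows. To keep the example short, each check returns its message as a String; that simplification is an assumption and not part of the original design.

```rust
struct EvenCheck;
struct NegativeCheck;

impl EvenCheck {
    fn check_number(&self, number: i32) -> Option<String> {
        if number % 2 == 0 {
            Some(format!("{} is even", number))
        } else {
            None
        }
    }
}

impl NegativeCheck {
    fn check_number(&self, number: i32) -> Option<String> {
        if number < 0 {
            Some(format!("{} is negative", number))
        } else {
            None
        }
    }
}

enum Checker {
    Even(EvenCheck),
    Negative(NegativeCheck),
}

impl Checker {
    // Dispatches to the wrapped check with an explicit match, so the
    // Vec only ever holds the single type `Checker`.
    fn check_number(&self, number: i32) -> Option<String> {
        match self {
            Checker::Even(check) => check.check_number(number),
            Checker::Negative(check) => check.check_number(number),
        }
    }
}
```

The trade-off is that every new check needs a new enum variant and a new match arm, whereas a trait object accepts new implementations without touching existing code.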
As for error handling, consider using the FromError framework, so that you can use the try! macro in your code and convert error types from one to another.

General pointer type for `Rc`, `Box`, `Arc`

I have a struct which references a value (because it is ?Sized or very big). This value has to live with the struct, of course.
However, the struct shouldn't restrict the user on how to accomplish that. Whether the user wraps the value in a Box or Rc or makes it 'static, the value just has to survive with the struct. Using named lifetimes would be complicated because the reference will be moved around and may outlive our struct. What I am looking for is a general pointer type (if it exists / can exist).
How can the struct make sure the referenced value lives as long as the struct lives, without specifying how?
Example (is.gd/Is9Av6):
type CallBack = Fn(f32) -> f32;
struct Caller {
call_back: Box<CallBack>,
}
impl Caller {
fn new(call_back: Box<CallBack>) -> Caller {
Caller {call_back: call_back}
}
fn call(&self, x: f32) -> f32 {
(self.call_back)(x)
}
}
let caller = {
// func goes out of scope
let func = |x| 2.0 * x;
Caller {call_back: Box::new(func)}
};
// func survives because it is referenced through a `Box` in `caller`
let y = caller.call(1.0);
assert_eq!(y, 2.0);
Compiles, all good. But if we don't want to use a Box as a pointer to our function (one can call Box a pointer, right?), but something else, like Rc, this won't be possible, since Caller restricts the pointer to be a Box.
let caller = {
// function is used by `Caller` and `main()` => shared resource
// solution: `Rc`
let func = Rc::new(|x| 2.0 * x);
let caller = Caller {call_back: func.clone()}; // ERROR Rc != Box
// we also want to use func now
let y = func(3.0);
caller
};
// func survives because it is referenced through a `Box` in `caller`
let y = caller.call(1.0);
assert_eq!(y, 2.0);
(is.gd/qUkAvZ)
Possible solution: Deref? (http://is.gd/mmY6QC)
use std::rc::Rc;
use std::ops::Deref;
type CallBack = Fn(f32) -> f32;
struct Caller<T>
where T: Deref<Target = Box<CallBack>> {
call_back: T,
}
impl<T> Caller<T>
where T: Deref<Target = Box<CallBack>> {
fn new(call_back: T) -> Caller<T> {
Caller {call_back: call_back}
}
fn call(&self, x: f32) -> f32 {
(*self.call_back)(x)
}
}
fn main() {
let caller = {
// function is used by `Caller` and `main()` => shared resource
// solution: `Rc`
let func_obj = Box::new(|x: f32| 2.0 * x) as Box<CallBack>;
let func = Rc::new(func_obj);
let caller = Caller::new(func.clone());
// we also want to use func now
let y = func(3.0);
caller
};
// func survives because it is referenced through a `Box` in `caller`
let y = caller.call(1.0);
assert_eq!(y, 2.0);
}
Is this the way to go with Rust? Using Deref? It works at least.
Am I missing something obvious?
This question did not solve my problem, since the value is practically unusable as a T.
While Deref provides the necessary functionality, AsRef and Borrow are more appropriate for this situation (Borrow more so than AsRef in the case of a struct). Both of these traits let your users use Box<T>, Rc<T> and Arc<T>, and Borrow also lets them use &T and T. Your Caller struct could be written like this:
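use std::borrow::Borrow;
// Same alias as CallBack in the question, spelled Callback in this answer.
type Callback = dyn Fn(f32) -> f32;
struct Caller<CB: Borrow<Callback>> {
callback: CB,
}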
Then, when you want to use the callback field, you need to call the borrow() (or as_ref()) method:
impl<CB> Caller<CB>
where CB: Borrow<Callback>
{
fn new(callback: CB) -> Caller<CB> {
Caller { callback: callback }
}
fn call(&self, x: f32) -> f32 {
(self.callback.borrow())(x)
}
}
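A self-contained version of the Borrow approach might look like this. It is a sketch in current Rust syntax, so the alias uses dyn; the explicit type annotation inside call is an addition made here to keep the borrow() call unambiguous.

```rust
use std::borrow::Borrow;

// The unsized callback type; Box<Callback> and Rc<Callback> both
// implement Borrow<Callback> in the standard library.
type Callback = dyn Fn(f32) -> f32;

struct Caller<CB: Borrow<Callback>> {
    callback: CB,
}

impl<CB: Borrow<Callback>> Caller<CB> {
    fn new(callback: CB) -> Self {
        Caller { callback }
    }

    fn call(&self, x: f32) -> f32 {
        // The annotation selects the Borrow<Callback> implementation.
        let f: &Callback = self.callback.borrow();
        f(x)
    }
}
```

The same Caller can then be built from a Box<Callback> or from a clone of an Rc<Callback>. In the Rc case the original handle stays usable next to the caller, which is the sharing the question asked for.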
It crashes with the current stable compiler (1.1), but not with beta or nightly (just use your last Playpen link and change the "Channel" setting at the top). I believe that support for Rc<Trait> was only partial in 1.1; there were some changes that didn't make it in time. This is probably why your code doesn't work.
To address the question of using Deref for this... if dereferencing the pointer is all you need... sure. It's really just a question of whether or not the trait(s) you've chosen support the operations you need. If yes, great.
As an aside, you can always write a new trait that expresses the exact semantics you need, and implement that for existing types. From what you've said, it doesn't seem necessary in this case.

Resources