Is there an idiomatic way of avoiding `Box::leak` with this code?

As I continue to learn Rust I'm working on a project which involves extensive use of predicate functions. I've decided to implement these predicates with Rust closures, e.g.:
type Predicate = Box<dyn Fn(&Form) -> bool>;
My program uses boolean logic applied to these predicates. For instance, both `and` and `or` are applied to the values of these predicates. I've made this work using Box::leak:
struct Form {
name: String,
}
fn and(a: Option<Predicate>, b: Option<Predicate>) -> Option<Predicate> {
if a.is_none() {
return b;
} else if b.is_none() {
return a;
} else {
let a = Box::leak(a.unwrap());
let b = Box::leak(b.unwrap());
return Some(Box::new(move |form: &Form| a(form) && b(form)));
}
}
While this seems to work as I'd like, Box::leak seems non-ideal. I don't know enough about std::rc::Rc and std::cell::RefCell to know if these might help me avoid Box::leak here — employing them might require significant restructuring of my code, but I'd like to at least understand what the idiomatic approach here might be.
Is there a way of avoiding the leak while still maintaining the same functionality?
Here's the complete example:
struct Form {
name: String,
}
type Predicate = Box<dyn Fn(&Form) -> bool>;
struct Foo {
predicates: Vec<Predicate>,
}
impl Foo {
fn and(a: Option<Predicate>, b: Option<Predicate>) -> Option<Predicate> {
if a.is_none() {
return b;
} else if b.is_none() {
return a;
} else {
let a = Box::leak(a.unwrap());
let b = Box::leak(b.unwrap());
return Some(Box::new(move |form: &Form| a(form) && b(form)));
}
}
}
fn main() {
let pred = Foo::and(
Some(Box::new(move |form: &Form| {
form.name == String::from("bar")
})),
Some(Box::new(move |_: &Form| true)),
)
.unwrap();
let foo = Foo {
predicates: vec![pred],
};
let pred = &foo.predicates[0];
let form_a = &Form {
name: String::from("bar"),
};
let form_b = &Form {
name: String::from("baz"),
};
assert_eq!(pred(form_a), true);
assert_eq!(pred(form_b), false);
}

Your code does not need Box::leak and it's unclear why you think it does. The code continues to compile and have the same output if it's removed:
impl Foo {
fn and(a: Option<Predicate>, b: Option<Predicate>) -> Option<Predicate> {
if a.is_none() {
b
} else if b.is_none() {
a
} else {
let a = a.unwrap();
let b = b.unwrap();
Some(Box::new(move |form: &Form| a(form) && b(form)))
}
}
}
The unwraps are non-idiomatic; a more idiomatic solution would use match:
impl Foo {
fn and(a: Option<Predicate>, b: Option<Predicate>) -> Option<Predicate> {
match (a, b) {
(a, None) => a,
(None, b) => b,
(Some(a), Some(b)) => Some(Box::new(move |form| a(form) && b(form))),
}
}
}
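Since the question mentions both `and` and `or`, here is a minimal sketch of the two combinators written with the same match pattern. It assumes the `dyn` spelling of the trait object that current Rust editions expect, and it uses free functions instead of methods on Foo. The `or` function is a hypothetical counterpart that does not appear in the question; treating a missing side as "no constraint" is also an assumption.

```rust
struct Form {
    name: String,
}

type Predicate = Box<dyn Fn(&Form) -> bool>;

// Combine two optional predicates; a missing side imposes no constraint.
fn and(a: Option<Predicate>, b: Option<Predicate>) -> Option<Predicate> {
    match (a, b) {
        (a, None) => a,
        (None, b) => b,
        // Both boxes are moved into the new closure, so nothing is leaked.
        (Some(a), Some(b)) => Some(Box::new(move |form: &Form| a(form) && b(form))),
    }
}

// Hypothetical counterpart for `or` (not part of the original question).
fn or(a: Option<Predicate>, b: Option<Predicate>) -> Option<Predicate> {
    match (a, b) {
        (a, None) => a,
        (None, b) => b,
        (Some(a), Some(b)) => Some(Box::new(move |form: &Form| a(form) || b(form))),
    }
}
```

The combined closure owns both boxed predicates, so they are dropped together with it.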

Related

Wrap type in enum and return reference

type Id = u8;
struct A {
id: Id,
}
struct B {
id: Id,
}
struct C {
id: Id,
}
struct State {
a_vec: Vec<A>,
b_vec: Vec<B>,
c_vec: Vec<C>,
}
impl State {
fn new() -> Self {
Self {
a_vec: Vec::new(),
b_vec: Vec::new(),
c_vec: Vec::new(),
}
}
fn get_e0(&self, id: Id) -> &E0 {
if let Some(a) = self.a_vec.iter().find(|x| x.id==id) {
&E0::A(a)
} else if let Some(b) = self.b_vec.iter().find(|x| x.id==id) {
&E0::B(b)
} else {
panic!("ahh that id doesn't exist everbody panic!!!")
}
}
fn get_e0_mut(&mut self, id: Id) -> &mut E0 {
if let Some(a) = self.a_vec.iter_mut().find(|x| x.id==id) {
&mut E0::A(a)
} else if let Some(b) = self.b_vec.iter_mut().find(|x| x.id==id) {
&mut E0::B(b)
} else {
panic!("ahh that id doesn't exist everbody panic!!!")
}
}
}
enum E0 {
A(A),
B(B),
}
enum E1 {
A(A),
C(C),
}
fn main() {
let state = State::new();
let a0 = A { id: 0 };
let a1 = A { id: 1 };
let b0 = B { id: 2 };
let c0 = C { id: 3 };
state.a_vec.push(a0);
state.a_vec.push(a1);
state.b_vec.push(b0);
state.c_vec.push(c0);
let e5 = state.get_e0(1);
}
I'm looking for a way to implement the functions get_e0 and get_e0_mut so that they wrap several types in an enum: the caller shouldn't have to care whether their id relates to an A or a B, only that they get an E0. A single Vec of E0s seems infeasible, as there might be other groupings such as E1.
If these functions are not possible, is there another method that could reduce the overhead of searching each of the respective Vecs individually every time?
It is guaranteed that all ids are unique.
You cannot return a reference to a temporary. Instead, you can make your enums generic over their contents. You can therefore use a single enum:
enum E0<T, U> {
A(T),
B(U),
}
You can then use it like this:
fn get_e0(&self, id: Id) -> E0<&A, &B> {
if let Some(a) = self.a_vec.iter().find(|x| x.id == id) {
E0::A(a)
} else if let Some(b) = self.b_vec.iter().find(|x| x.id == id) {
E0::B(b)
} else {
panic!("ahh that id doesn't exist everbody panic!!!")
}
}
fn get_e0_mut(&mut self, id: Id) -> E0<&mut A, &mut B> {
if let Some(a) = self.a_vec.iter_mut().find(|x| x.id == id) {
E0::A(a)
} else if let Some(b) = self.b_vec.iter_mut().find(|x| x.id == id) {
E0::B(b)
} else {
panic!("ahh that id doesn't exist everbody panic!!!")
}
}
Thanks to lifetime elision rules, you don't have to specify lifetimes.
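To make the elision concrete, here is a sketch of the same two methods with the lifetimes written out by hand. The trimmed-down State (without c_vec) and the panic message are simplifications for this example; the signatures are what the elided versions should expand to.

```rust
type Id = u8;

struct A {
    id: Id,
}

struct B {
    id: Id,
}

enum E0<T, U> {
    A(T),
    B(U),
}

struct State {
    a_vec: Vec<A>,
    b_vec: Vec<B>,
}

impl State {
    // Both references inside the enum borrow from `self` for the same lifetime.
    fn get_e0<'a>(&'a self, id: Id) -> E0<&'a A, &'a B> {
        if let Some(a) = self.a_vec.iter().find(|x| x.id == id) {
            E0::A(a)
        } else if let Some(b) = self.b_vec.iter().find(|x| x.id == id) {
            E0::B(b)
        } else {
            panic!("no element with that id")
        }
    }

    // The mutable version follows the same shape with `&'a mut`.
    fn get_e0_mut<'a>(&'a mut self, id: Id) -> E0<&'a mut A, &'a mut B> {
        if let Some(a) = self.a_vec.iter_mut().find(|x| x.id == id) {
            E0::A(a)
        } else if let Some(b) = self.b_vec.iter_mut().find(|x| x.id == id) {
            E0::B(b)
        } else {
            panic!("no element with that id")
        }
    }
}
```

The caller matches on the returned enum and can mutate the element in place through the `&mut` it carries.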
Note that if you want to avoid the panic, your return type should express the notion that there can be no value found.
You can for example return an Option:
fn get_e0(&self, id: Id) -> Option<E0<&A, &B>> { ... }
Or alter the enum to have a None variant, similar to Option:
enum E0<T, U> {
A(T),
B(U),
None,
}
And use it like this:
fn get_e0(&self, id: Id) -> E0<&A, &B> {
if let Some(a) = self.a_vec.iter().find(|x| x.id==id) {
E0::A(a)
} else if let Some(b) = self.b_vec.iter().find(|x| x.id==id) {
E0::B(b)
} else {
E0::None
}
}
It is usually more idiomatic to express such situations in the type system than to panic.
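As an illustration of the Option-returning variant, here is a minimal sketch. The State below omits c_vec for brevity, and the use of `map(E0::B)` for the fallback branch is a stylistic choice of this example, not something taken from the answer.

```rust
type Id = u8;

struct A {
    id: Id,
}

struct B {
    id: Id,
}

enum E0<T, U> {
    A(T),
    B(U),
}

struct State {
    a_vec: Vec<A>,
    b_vec: Vec<B>,
}

impl State {
    // Returns None instead of panicking when the id is unknown.
    fn get_e0(&self, id: Id) -> Option<E0<&A, &B>> {
        if let Some(a) = self.a_vec.iter().find(|x| x.id == id) {
            Some(E0::A(a))
        } else {
            // Wrap a found `&B` in the enum, or pass the None through.
            self.b_vec.iter().find(|x| x.id == id).map(E0::B)
        }
    }
}
```

The caller then decides what a missing id means, for example with `match` or `if let`.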

Rust: how to tell the borrow checker that the move depends on a bool?

The following code doesn't pass the borrow checker, because Label-A uses a value that is consumed at Label-B. The code is actually safe, though: Label-A is guarded by processed, which is only set if Label-B has run.
How can I tell the compiler about this dependency or, if I cannot, what is the idiom for solving this issue?
(Making X Copy/Clone is not acceptable, nor is making consume take a reference, and Rc<X> is not appealing either, as the data structure is already quite complicated.)
struct X(i32);
fn consume1(_x: X) {
()
}
fn consume2(_x: X) {
()
}
fn predicate(_x: &X) -> bool {
true
}
pub fn main() {
let xs = vec![X(1), X(2)];
for x in xs {
let mut processed = false;
// for _ in _ {
if predicate(&x) {
consume1(x); // Label-B
processed = true;
}
// } end for
// this for loop here is just to show that the real code
// is more complicated, the consume1() is actually called
// (somehow) inside this inner loop
// some more code
if !processed {
consume2(x); // Label-A
}
}
}
Unless I've misunderstood I think the best option for you is to use 'Option'. This way you can also get rid of that boolean flag.
struct X( i32 );
fn consume1( _x: X ) { }
fn consume2( _x: X ) { }
fn predicate( _x: &X ) -> bool {
true
}
pub fn main( ) {
let xs = vec![ Some( X( 1 ) ), Some( X( 2 ) ) ];
for mut x in xs {
if predicate( x.as_ref( ).unwrap( ) ) {
consume1( x.take( ).unwrap( ) );
}
if let Some( x ) = x {
consume2( x );
}
}
}
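If changing the element type of the Vec is not desirable, the same idea can be applied locally. The sketch below keeps `Vec<X>` and wraps each value in an Option inside the loop. The `log` parameter, the `process` function, and the even/odd predicate are assumptions added only so the outcome can be observed.

```rust
struct X(i32);

// The log parameter is an addition for this sketch; it records which path ran.
fn consume1(x: X, log: &mut Vec<String>) {
    log.push(format!("consume1({})", x.0));
}

fn consume2(x: X, log: &mut Vec<String>) {
    log.push(format!("consume2({})", x.0));
}

// Illustrative predicate: true for even values.
fn predicate(x: &X) -> bool {
    x.0 % 2 == 0
}

fn process(xs: Vec<X>) -> Vec<String> {
    let mut log = Vec::new();
    for x in xs {
        // Wrap the value locally; an empty slot plays the role of `processed == true`.
        let mut slot = Some(x);
        if slot.as_ref().map_or(false, predicate) {
            if let Some(x) = slot.take() {
                consume1(x, &mut log); // Label-B
            }
        }
        // The slot still holds a value only if Label-B did not run.
        if let Some(x) = slot {
            consume2(x, &mut log); // Label-A
        }
    }
    log
}
```

Because the move happens through `take()`, the compiler can see that the second use only touches whatever is left in the slot.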

How can I get the T from an Option<T> when using syn?

I'm using syn to parse Rust code. When I read a named field's type using field.ty, I get a syn::Type. When I print it using quote!{#ty}.to_string() I get "Option<String>".
How can I get just "String"? I want to use #ty in quote! to print "String" instead of "Option<String>".
I want to generate code like:
impl Foo {
pub fn set_bar(&mut self, v: String) {
self.bar = Some(v);
}
}
starting from
struct Foo {
bar: Option<String>
}
My attempt:
let ast: DeriveInput = parse_macro_input!(input as DeriveInput);
let data: Data = ast.data;
match data {
Data::Struct(ref data) => match data.fields {
Fields::Named(ref fields) => {
fields.named.iter().for_each(|field| {
let name = &field.ident.clone().unwrap();
let ty = &field.ty;
quote!{
impl Foo {
pub fn set_bar(&mut self, v: #ty) {
self.bar = Some(v);
}
}
};
});
}
_ => {}
},
_ => panic!("You can derive it only from struct"),
}
My updated version of the response from @Boiethios, tested and used in a public crate, with support for several spellings of Option:
Option
std::option::Option
::std::option::Option
core::option::Option
::core::option::Option
fn extract_type_from_option(ty: &syn::Type) -> Option<&syn::Type> {
use syn::{GenericArgument, Path, PathArguments, PathSegment};
fn extract_type_path(ty: &syn::Type) -> Option<&Path> {
match *ty {
syn::Type::Path(ref typepath) if typepath.qself.is_none() => Some(&typepath.path),
_ => None,
}
}
// TODO store (with lazy static) the vec of string
// TODO maybe optimization, reverse the order of segments
fn extract_option_segment(path: &Path) -> Option<&PathSegment> {
let idents_of_path = path
.segments
.iter()
.into_iter()
.fold(String::new(), |mut acc, v| {
acc.push_str(&v.ident.to_string());
acc.push('|');
acc
});
vec!["Option|", "std|option|Option|", "core|option|Option|"]
.into_iter()
.find(|s| &idents_of_path == *s)
.and_then(|_| path.segments.last())
}
extract_type_path(ty)
.and_then(|path| extract_option_segment(path))
.and_then(|path_seg| {
let type_params = &path_seg.arguments;
// It should have only one angle-bracketed param ("<String>"):
match *type_params {
PathArguments::AngleBracketed(ref params) => params.args.first(),
_ => None,
}
})
.and_then(|generic_arg| match *generic_arg {
GenericArgument::Type(ref ty) => Some(ty),
_ => None,
})
}
You should do something like this untested example:
use syn::{GenericArgument, Path, PathArguments, Type};
fn extract_type_from_option(ty: &Type) -> &Type {
// True only for the bare, single-segment path `Option`.
fn path_is_option(path: &Path) -> bool {
path.leading_colon.is_none()
&& path.segments.len() == 1
&& path.segments.iter().next().unwrap().ident == "Option"
}
match ty {
Type::Path(typepath) if typepath.qself.is_none() && path_is_option(&typepath.path) => {
// Get the first segment of the path (there is only one, in fact: "Option"):
let type_params = &typepath.path.segments.iter().next().unwrap().arguments;
// It should have only one angle-bracketed param ("<String>"):
let generic_arg = match type_params {
PathArguments::AngleBracketed(params) => params.args.iter().next().unwrap(),
_ => panic!("TODO: error handling"),
};
// This argument must be a type:
match generic_arg {
GenericArgument::Type(ty) => ty,
_ => panic!("TODO: error handling"),
}
}
_ => panic!("TODO: error handling"),
}
}
There isn't much to explain; it just "unrolls" the various components of a type:
Type -> TypePath -> Path -> PathSegment -> PathArguments -> AngleBracketedGenericArguments -> GenericArgument -> Type.
If there is an easier way to do that, I would be happy to know it.
Note that since syn is a parser, it works with tokens. You cannot know for sure that this is an Option. The user could, for example, type std::option::Option, or write type MaybeString = std::option::Option<String>;. You cannot handle those arbitrary names.

How to generalise access to struct fields?

I am trying to find the differences between two streams (represented by iterators) for later analysis. The code below works fine, but the update_v? functions that update the values look ugly and are error-prone (copy-paste!). Is there a way to generalise them, given that it matters which source a value came from?
#[derive(PartialEq)]
struct Data {}
struct S {
v1: Option<Data>,
v2: Option<Data>
}
...
fn update_v1(diffs: &mut HashMap<u64, S>, key: u64, data: Data) {
match diffs.entry(key) {
Entry::Vacant(v) => {
let variant = S {
v1: Some(data),
v2: None
};
v.insert(variant);
},
Entry::Occupied(e) => {
let new_variant = Some(data);
if e.get().v2 == new_variant {
e.remove();
} else {
let existing = e.into_mut();
existing.v1 = new_variant;
}
}
}
}
fn update_v2(diffs: &mut HashMap<u64, S>, key: u64, data: Data) {
match diffs.entry(key) {
Entry::Vacant(v) => {
let variant = S {
v2: Some(data),
v1: None
};
v.insert(variant);
},
Entry::Occupied(e) => {
let new_variant = Some(data);
if e.get().v1 == new_variant {
e.remove();
} else {
let existing = e.into_mut();
existing.v2 = new_variant;
}
}
}
}
Instead of writing one function for each field, receive a pair of functions as arguments:
A getter, fn(&S) -> &Option<Data>, which can be used to replace this condition
if e.get().v1 == new_variant { /* ... */ }
with this
if *getter(e.get()) == new_variant { /* ... */ }
A setter, fn(&mut S, Option<Data>), which replaces
existing.v2 = new_variant;
with
setter(existing, new_variant);
Then at the call site you pass a pair of closures like this
Getter: |s| &s.v1
Setter: |s, d| s.v2 = d
Or vice versa for the other function.
And if you want to keep the update_v1 and update_v2 function names, just write those as wrappers to this new generalized function that automatically pass the proper lambdas.
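The getter/setter idea can be sketched as follows. This is one possible shape under stated assumptions, not the answerer's exact code: `Data` is given a `u32` payload so values can be compared, the generic function is named `update`, and the parameter names `other` and `set` are hypothetical.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Assumption: Data carries a payload and can be compared.
#[derive(Debug, PartialEq)]
struct Data(u32);

#[derive(Debug, PartialEq)]
struct S {
    v1: Option<Data>,
    v2: Option<Data>,
}

// `other` reads the field written by the opposite stream; `set` writes our own field.
fn update<G, P>(diffs: &mut HashMap<u64, S>, key: u64, data: Data, other: G, set: P)
where
    G: Fn(&S) -> &Option<Data>,
    P: Fn(&mut S, Option<Data>),
{
    let new_value = Some(data);
    match diffs.entry(key) {
        Entry::Vacant(v) => {
            let mut s = S { v1: None, v2: None };
            set(&mut s, new_value);
            v.insert(s);
        }
        Entry::Occupied(e) => {
            if *other(e.get()) == new_value {
                // Both streams agree on this key, so it is not a difference.
                e.remove();
            } else {
                set(e.into_mut(), new_value);
            }
        }
    }
}

// Thin wrappers keep the original function names.
fn update_v1(diffs: &mut HashMap<u64, S>, key: u64, data: Data) {
    update(diffs, key, data, |s| &s.v2, |s, d| s.v1 = d);
}

fn update_v2(diffs: &mut HashMap<u64, S>, key: u64, data: Data) {
    update(diffs, key, data, |s| &s.v1, |s, d| s.v2 = d);
}
```

The field-specific logic now lives in the two one-line wrappers, so the entry-handling code exists only once.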
You can create a trait to facilitate different ways of accessing the structure.
trait SAccessor {
type RV;
fn new(data: Data) -> S;
fn v2(s: &S) -> &Self::RV;
fn v1_mut(s: &mut S) -> &mut Self::RV;
}
struct DirectSAccessor;
impl SAccessor for DirectSAccessor {
type RV = Option<Data>;
fn new(data: Data) -> S {
S {
v1: Some(data),
v2: None
}
}
fn v2(s: &S) -> &Self::RV {
&s.v2
}
fn v1_mut(s: &mut S) -> &mut Self::RV {
&mut s.v1
}
}
fn update<A>(diffs: &mut HashMap<u64, S>, key: u64, data: Data)
where A: SAccessor<RV=Option<Data>>
{
match diffs.entry(key) {
Entry::Vacant(v) => {
let variant = A::new(data);
v.insert(variant);
},
Entry::Occupied(e) => {
let new_variant = Some(data);
if A::v2(e.get()) == &new_variant {
e.remove();
} else {
let existing = e.into_mut();
*A::v1_mut(existing) = new_variant;
}
}
}
}
// ...
// update::<DirectSAccessor>( ... );
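To cover both directions with the trait approach, a second accessor type is needed. The sketch below is a hypothetical, direction-neutral variant of the trait above: the method names `other` and `own_mut`, the marker types `Stream1` and `Stream2`, and the `u32` payload in `Data` are all assumptions made for illustration, and the associated type is dropped for brevity.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Assumption: Data carries a payload and can be compared.
#[derive(Debug, PartialEq)]
struct Data(u32);

#[derive(Debug, PartialEq)]
struct S {
    v1: Option<Data>,
    v2: Option<Data>,
}

// Hypothetical accessor trait: one implementation per source stream.
trait SAccessor {
    fn new(data: Data) -> S;
    fn other(s: &S) -> &Option<Data>;
    fn own_mut(s: &mut S) -> &mut Option<Data>;
}

struct Stream1;
impl SAccessor for Stream1 {
    fn new(data: Data) -> S {
        S { v1: Some(data), v2: None }
    }
    fn other(s: &S) -> &Option<Data> {
        &s.v2
    }
    fn own_mut(s: &mut S) -> &mut Option<Data> {
        &mut s.v1
    }
}

struct Stream2;
impl SAccessor for Stream2 {
    fn new(data: Data) -> S {
        S { v1: None, v2: Some(data) }
    }
    fn other(s: &S) -> &Option<Data> {
        &s.v1
    }
    fn own_mut(s: &mut S) -> &mut Option<Data> {
        &mut s.v2
    }
}

fn update<A: SAccessor>(diffs: &mut HashMap<u64, S>, key: u64, data: Data) {
    match diffs.entry(key) {
        Entry::Vacant(v) => {
            v.insert(A::new(data));
        }
        Entry::Occupied(e) => {
            let new_value = Some(data);
            if *A::other(e.get()) == new_value {
                // Same value seen on both streams: drop the entry.
                e.remove();
            } else {
                *A::own_mut(e.into_mut()) = new_value;
            }
        }
    }
}
```

A call then names the source stream at the type level, for example `update::<Stream2>(&mut diffs, key, data)`.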

How to group 'Option' assignments in Rust?

I have a block of code where multiple optional variables need to be assigned at once. There is very little chance any of the values will be None, so individually handling each failed case isn't especially useful.
Currently I write the checks like this:
if let Some(a) = foo_a() {
if let Some(b) = foo_b() {
if let Some(c) = foo_c() {
if let Some(d) = foo_d() {
// code
}
}
}
}
Written this way, adding a new variable indents the block by one more level, which makes for noisy diffs and unnecessarily deep indentation. It would be convenient if it were possible to group the assignments, like this:
if let Some(a) = foo_a() &&
let Some(b) = foo_b() &&
let Some(c) = foo_c() &&
let Some(d) = foo_d()
{
// code
}
Is there a way to assign multiple Options in one if statement?
Some details worth noting:
The first function that fails should short circuit and not call the others. Otherwise, it could be written like this:
if let (Some(a), Some(b), Some(c), Some(d)) = (foo_a(), foo_b(), foo_c(), foo_d()) {
// Code
}
Deep indentation could be avoided using a function, but I would prefer not to do this since you may not want to have the body in a different scope...
fn my_function(a: Foo, b: Foo, c: Foo, d: Foo) {
// code
}
if let Some(a) = foo_a() {
if let Some(b) = foo_b() {
if let Some(c) = foo_c() {
if let Some(d) = foo_d() {
my_function(a, b, c, d);
}
}
}
}
As @SplittyDev said, you can create a macro to get the functionality you want. Here is an alternate macro-based solution which also retains the short-circuiting behaviour:
macro_rules! iflet {
([$p:pat = $e:expr] $($rest:tt)*) => {
if let $p = $e {
iflet!($($rest)*);
}
};
($b:block) => {
$b
};
}
fn main() {
iflet!([Some(a) = foo_a()] [Some(b) = foo_b()] [Some(c) = foo_c()] {
println!("{} {} {}", a, b, c);
});
}
The standard library doesn't include that exact functionality, but the language allows you to create the desired behavior using a small macro.
Here's what I came up with:
macro_rules! all_or_nothing {
($($opt:expr),*) => {{
if false $(|| $opt.is_none())* {
None
} else {
Some(($($opt.unwrap(),)*))
}
}};
}
You can feed it all your options and get some tuple containing the unwrapped values if all values are Some, or None in the case that any of the options are None.
The following is a brief example on how to use it:
fn main() {
let foo = Some(0);
let bar = Some(1);
let baz = Some(2);
if let Some((a, b, c)) = all_or_nothing!(foo, bar, baz) {
println!("foo: {}; bar: {}; baz: {}", a, b, c);
} else {
panic!("Something was `None`!");
}
}
My first inclination was to do something similar to swizard's answer, but to wrap it up in a trait to make the chaining cleaner. It's also a bit simpler without the need for extra function invocations.
It does have the downside of increasing the nesting of the tuples.
fn foo_a() -> Option<u8> {
println!("foo_a() invoked");
Some(1)
}
fn foo_b() -> Option<u8> {
println!("foo_b() invoked");
None
}
fn foo_c() -> Option<u8> {
println!("foo_c() invoked");
Some(3)
}
trait Thing<T> {
fn thing<F, U>(self, f: F) -> Option<(T, U)> where F: FnOnce() -> Option<U>;
}
impl<T> Thing<T> for Option<T> {
fn thing<F, U>(self, f: F) -> Option<(T, U)>
where F: FnOnce() -> Option<U>
{
self.and_then(|a| f().map(|b| (a, b)))
}
}
fn main() {
let x = foo_a()
.thing(foo_b)
.thing(foo_c);
match x {
Some(((a, b), c)) => println!("matched: a = {}, b = {}, c = {}", a, b, c),
None => println!("nothing matched"),
}
}
Honestly, someone should point out that Option is an applicative functor :)
The code will be quite ugly without currying support in Rust, but it works and it shouldn't make a noisy diff:
fn foo_a() -> Option<isize> {
println!("foo_a() invoked");
Some(1)
}
fn foo_b() -> Option<isize> {
println!("foo_b() invoked");
Some(2)
}
fn foo_c() -> Option<isize> {
println!("foo_c() invoked");
Some(3)
}
let x = Some(|v| v)
.and_then(|k| foo_a().map(|v| move |x| k((v, x))))
.and_then(|k| foo_b().map(|v| move |x| k((v, x))))
.and_then(|k| foo_c().map(|v| move |x| k((v, x))))
.map(|k| k(()));
match x {
Some((a, (b, (c, ())))) =>
println!("matched: a = {}, b = {}, c = {}", a, b, c),
None =>
println!("nothing matched"),
}
You can group the values using the '?' operator to return an Option of a tuple with the required values. If one of them is None, the group_options function will return None.
fn foo_a() -> Option<u8> {
println!("foo_a() invoked");
Some(1)
}
fn foo_b() -> Option<u8> {
println!("foo_b() invoked");
None
}
fn foo_c() -> Option<u8> {
println!("foo_c() invoked");
Some(3)
}
fn group_options() -> Option<(u8, u8, u8)> {
let a = foo_a()?;
let b = foo_b()?;
let c = foo_c()?;
Some((a, b, c))
}
fn main() {
if let Some((a, b, c)) = group_options() {
println!("{}", a);
println!("{}", b);
println!("{}", c);
}
}
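If moving the calls into a separate named function is undesirable, the same `?` trick can be used inside a closure that is called on the spot, which keeps everything in the surrounding scope. The sketch below assumes a hypothetical `probe` helper standing in for foo_a, foo_b and foo_c; it records each call in a log so the short-circuiting can be checked.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for foo_a/foo_b/foo_c: records the call, then returns `value`.
fn probe(log: &RefCell<Vec<&'static str>>, name: &'static str, value: Option<u8>) -> Option<u8> {
    log.borrow_mut().push(name);
    value
}

fn grouped(log: &RefCell<Vec<&'static str>>, b: Option<u8>) -> Option<(u8, u8, u8)> {
    // Inside the closure, `?` returns from the closure only, not from `grouped`.
    let group = || -> Option<(u8, u8, u8)> {
        Some((
            probe(log, "a", Some(1))?,
            probe(log, "b", b)?,
            probe(log, "c", Some(3))?,
        ))
    };
    group()
}
```

The first None stops evaluation, so later calls are skipped, and the result can be destructured with a single `if let Some((a, b, c)) = ...`.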
