I've been learning Rust for a while and loving it. I've hit a wall, though, trying to do something that ought to be simple and elegant, so I'm sure I'm missing the obvious.
I'm parsing JavaScript using the excellent RESSA crate and end up with an AST, which is a graph of structs defined by the crate. I need to traverse this many times and 'visit' certain nodes with my logic, so I've written a traverser that calls a callback when it hits certain nodes. In my naivety, I thought I'd define a struct with a field for every node type, each holding an Option<Fn()> value; the traverser checks for Some and calls it. This works, but it's ugly because I have to populate this enormous struct with dozens of fields, most of which are None because I'm not interested in those types.
Then I thought of traits: I'd define a trait 'Visit' whose function has a default implementation that does nothing, and then override it with my desired implementation. But this is no good, because every type must have an implementation, and then that implementation cannot be redefined. Is there a nice way to provide a specific implementation for a few types and leave the rest as default, or to check for the existence of a function before calling it? I must be missing an idiomatic way to do this.
You can look at something like syn::visit::Visit, the visitor trait in a popular Rust AST library, for inspiration.
The Visit trait is implemented by the visitor only, and has one method for each node type, with the default implementation only visiting the children:
// this snippet has been slightly altered from the source
pub trait Visit<'ast> {
fn visit_expr(&mut self, i: &'ast Expr) {
visit_expr(self, i);
}
fn visit_expr_array(&mut self, i: &'ast ExprArray) {
visit_expr_array(self, i);
}
fn visit_expr_assign(&mut self, i: &'ast ExprAssign) {
visit_expr_assign(self, i);
}
// ...
}
pub fn visit_expr<'ast, V>(v: &mut V, node: &'ast Expr)
where
V: Visit<'ast> + ?Sized,
{
match node {
Expr::Array(_binding_0) => v.visit_expr_array(_binding_0),
Expr::Assign(_binding_0) => v.visit_expr_assign(_binding_0),
// ...
}
}
pub fn visit_expr_array<'ast, V>(v: &mut V, node: &'ast ExprArray)
where
V: Visit<'ast> + ?Sized,
{
for el in &node.elems {
v.visit_expr(el);
}
}
// ...
With this pattern, you can create a visitor where you only implement the methods you need, and whatever you don't implement will just get the default behavior.
Additionally, because the default methods call separate free functions that implement the default behavior, you can call those from your custom visitor methods whenever you still want the children to be visited. (Rust doesn't let you invoke the default implementation of an overridden trait method directly.)
So for example, a visitor to print all array expressions in a Rust program using syn::visit::Visit could look like:
use syn::visit::{self, Visit};
use syn::ExprArray;
struct MyVisitor;
impl<'ast> Visit<'ast> for MyVisitor {
    fn visit_expr_array(&mut self, i: &'ast ExprArray) {
        // Debug printing of syn nodes needs syn's "extra-traits" feature
        println!("{:?}", i);
        // call the default free function to visit this node's children as well
        visit::visit_expr_array(self, i);
    }
}
fn main() {
let root = get_ast_root_node();
MyVisitor.visit_expr(&root);
}
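If you want to try the pattern without pulling in syn, here is a minimal self-contained sketch. The tiny AST (Expr, ExprArray), the visit_num method and the ArrayCounter visitor are all made up for illustration; they are not part of syn or RESSA.

```rust
// Hypothetical mini-AST, used only to illustrate the pattern.
pub enum Expr {
    Num(i64),
    Array(ExprArray),
}

pub struct ExprArray {
    pub elems: Vec<Expr>,
}

pub trait Visit<'ast> {
    // Each method defaults to the free function that walks the children.
    fn visit_expr(&mut self, node: &'ast Expr) {
        visit_expr(self, node);
    }
    fn visit_expr_array(&mut self, node: &'ast ExprArray) {
        visit_expr_array(self, node);
    }
    // Leaf node: nothing to walk, so the default does nothing.
    fn visit_num(&mut self, _value: i64) {}
}

pub fn visit_expr<'ast, V: Visit<'ast> + ?Sized>(v: &mut V, node: &'ast Expr) {
    match node {
        Expr::Num(n) => v.visit_num(*n),
        Expr::Array(a) => v.visit_expr_array(a),
    }
}

pub fn visit_expr_array<'ast, V: Visit<'ast> + ?Sized>(v: &mut V, node: &'ast ExprArray) {
    for el in &node.elems {
        v.visit_expr(el);
    }
}

// A visitor that only cares about arrays; everything else stays default.
#[derive(Default)]
pub struct ArrayCounter {
    pub arrays: usize,
}

impl<'ast> Visit<'ast> for ArrayCounter {
    fn visit_expr_array(&mut self, node: &'ast ExprArray) {
        self.arrays += 1;
        // Delegate to the free function so nested arrays are visited too.
        visit_expr_array(self, node);
    }
}

pub fn count_arrays(root: &Expr) -> usize {
    let mut counter = ArrayCounter::default();
    counter.visit_expr(root);
    counter.arrays
}

fn main() {
    let root = Expr::Array(ExprArray {
        elems: vec![Expr::Num(1), Expr::Array(ExprArray { elems: vec![] })],
    });
    println!("arrays: {}", count_arrays(&root));
}
```

The visitor overrides a single method; visit_expr and visit_num keep their defaults, which is the "only implement what you need" behavior the question asks for.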
I have two types, call them Reusable and SingleUse, such that it is possible to construct a SingleUse from a Reusable. I'd like to enforce the invariant that the former can be reused while the latter is consumed by accepting a &Reusable or a SingleUse parameter. Something like:
let re = Reusable{};
let su = SingleUse{source: re.clone()};
do_something(&re);
do_something(&re);
do_something(su);
I can accomplish this by defining a From<&Reusable> for SingleUse and then taking an Into<SingleUse> for the do_something() parameter, but I'm not certain whether this is a misuse of From/Into. Most examples I've seen involving From don't mention using references. This article has an example of a From<&'a [T]> but still suggests that From/Into are intended to operate on values (and contrasts that with other traits like AsRef, which are intended to operate on references).
So I'm hoping to sanity-check that using From on a reference type is appropriate or, if not, to learn whether there's a more idiomatic way to model this Reusable/SingleUse relationship.
Full example (playground):
#[derive(Clone, Debug)]
struct Reusable { }
#[derive(Debug)]
struct SingleUse {
#[allow(dead_code)]
source: Reusable,
}
impl From<&Reusable> for SingleUse {
fn from(source: &Reusable) -> Self {
SingleUse{source: source.clone()}
}
}
fn do_something(input: impl Into<SingleUse>) {
println!("Saw:{:?}", input.into());
}
fn main() {
let re = Reusable{};
let su = SingleUse{source: re.clone()};
do_something(&re);
do_something(&re);
do_something(su);
}
std implements From<&str> for {Box<str>,Rc<str>,String,Arc<str>,Vec<u8>}, From<&String> for String, From<&[T]> for {Box<[T]>,Rc<[T]>,Arc<[T]>,Vec<T>} and some more. Especially note the From<&String> for String (essentially Clone) that is pretty similar to your situation. So I'd say this is fine.
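To make that concrete, here is a small sketch that exercises a few of the std impls listed above. It only shows that converting from a reference leaves the source usable, which is the same shape as From<&Reusable> for SingleUse.

```rust
use std::rc::Rc;

fn main() {
    let owned = String::from("hello");

    // From<&String> for String: clones the borrowed string.
    let copy: String = String::from(&owned);

    // From<&str> for Box<str> and Rc<str>.
    let boxed: Box<str> = Box::from(owned.as_str());
    let shared: Rc<str> = Rc::from(owned.as_str());

    // From<&[T]> for Vec<T> (requires T: Clone).
    let nums = [1, 2, 3];
    let v: Vec<i32> = Vec::from(&nums[..]);

    // The sources were only borrowed, so they are still usable here.
    println!("{owned} {copy} {boxed} {shared} {nums:?} {v:?}");
}
```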
I want to create a new iterator method such as:
let test_data = vec![1,2,3,1,1,1,1];
let indexes_with_val_1 = test_data.iter().find_all(|element| element == 1).unwrap();
assert_eq!(indexes_with_val_1, vec!(0,3,4,5,6));
So essentially I want to add a new method to the std::iter::Iterator trait, but I can't find examples that work for this.
The logic is not an issue, as I have a free function that works fine; I would just like to be able to call it as in the code example above, for better ergonomics.
You can use a design pattern called extension traits. You can't extend the Iterator trait, but you can write a new one. Here's what we're going to do.
Write a new trait, IteratorExt, which has your custom method in it.
Write a blanket impl that implements IteratorExt for any type that implements Iterator.
Import IteratorExt to get access to your extension function.
For example, we can add a simple function called my_extension to iterators like so:
trait IteratorExt {
fn my_extension(self) -> Self;
}
impl<T: Iterator> IteratorExt for T {
fn my_extension(self) -> Self {
println!("Hey, it worked!");
self
}
}
pub fn main() {
let x = vec!(1, 2, 3, 4);
let y = x.iter().my_extension().map(|x| x + 1).collect::<Vec<_>>();
println!("{:?}", y);
}
The only downside is that you have to import the new trait to use it. So if you want to use my_extension in another file, you have to import IteratorExt specifically in order to do so.
In my experience, the Rust community is somewhat divided on whether this is legitimate practice or whether it's a hack to be avoided, so your mileage may vary.
There is an extension trait pattern for that.
The idea is that you create a trait, usually named TraitExt by convention, and implement it for all implementations of Trait:
pub trait IteratorExt {
fn my_iterator_extension(&self);
}
impl<I: Iterator + ?Sized> IteratorExt for I {
fn my_iterator_extension(&self) {
// Do work.
}
}
my_iterator.my_iterator_extension();
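Applied to the find_all method from the question, the pattern might look like the sketch below. The trait name FindAllExt and the decision to return a plain Vec<usize> (rather than something you unwrap) are my assumptions. Unlike the snippets above, this version makes Iterator a supertrait so the default method body can call enumerate directly.

```rust
/// Hypothetical extension trait adding `find_all` to every iterator.
pub trait FindAllExt: Iterator + Sized {
    /// Returns the indexes of all items for which `pred` returns true.
    fn find_all<P>(self, mut pred: P) -> Vec<usize>
    where
        P: FnMut(&Self::Item) -> bool,
    {
        self.enumerate()
            .filter(|(_, item)| pred(item))
            .map(|(index, _)| index)
            .collect()
    }
}

// Blanket impl: every sized iterator gets the method for free.
impl<I: Iterator> FindAllExt for I {}

fn main() {
    let test_data = vec![1, 2, 3, 1, 1, 1, 1];
    // `iter()` yields `&i32` and the predicate receives `&&i32`.
    let indexes_with_val_1 = test_data.iter().find_all(|element| **element == 1);
    assert_eq!(indexes_with_val_1, vec![0, 3, 4, 5, 6]);
}
```

As with any extension trait, FindAllExt has to be in scope (via use) wherever find_all is called.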
I want to offer a safe API like the FooManager below. It should be able to store arbitrary user-defined values that implement a trait Foo. It should also be able to hand them back later, not as a trait object (Box<dyn Foo>) but as the original type (Box<T> where T: Foo). At least conceptually, it should be possible to offer this as a safe API by using generic handles (Handle<T>); see below.
Additional criteria:
The solution should work in stable Rust (internal usage of unsafe blocks is perfectly okay though).
I don't want to modify the trait Foo, as e.g. suggested in How to get a reference to a concrete type from a trait object?. It should work without adding a method as_any(). Reasoning: Foo shouldn't have any knowledge about the fact that it might be stored in containers and be restored to the actual type.
trait Foo {}
struct Handle<T> {
// ...
}
struct FooManager {
// ...
}
impl FooManager {
// A real-world API would complain if the value is already stored.
pub fn keep_foo<T: Foo>(&mut self, foo: Box<T>) -> Handle<T> {
// ...
}
// In a real-world API this would return an `Option`.
pub fn return_foo<T: Foo>(&mut self, handle: Handle<T>) -> Box<T> {
// ...
}
}
I came up with this (Rust Playground), but I'm not sure whether there's a better way, or whether it's even safe. What do you think of that approach?
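For comparison, here is one possible sketch that needs no unsafe and leaves Foo untouched. It is not the playground code; it rests on two assumptions of mine that differ from the signatures above: T must be 'static (required by std::any::Any), and the manager stores values as Box<dyn Any>, so it cannot call Foo methods on them while they are stored. The id counter, the Widget type and the Option return are also just illustrative choices.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::marker::PhantomData;

pub trait Foo {}

/// Typed ticket for a stored value; `T` is only tracked at compile time.
pub struct Handle<T> {
    id: u64,
    _marker: PhantomData<fn() -> T>,
}

#[derive(Default)]
pub struct FooManager {
    next_id: u64,
    // Assumption: values are type-erased as `dyn Any`, not `dyn Foo`.
    stored: HashMap<u64, Box<dyn Any>>,
}

impl FooManager {
    pub fn keep_foo<T: Foo + 'static>(&mut self, foo: Box<T>) -> Handle<T> {
        let id = self.next_id;
        self.next_id += 1;
        self.stored.insert(id, foo); // Box<T> coerces to Box<dyn Any>
        Handle { id, _marker: PhantomData }
    }

    /// Returns `None` if the handle does not match a value in this manager.
    pub fn return_foo<T: Foo + 'static>(&mut self, handle: Handle<T>) -> Option<Box<T>> {
        // Check the type first so a foreign handle cannot evict a value.
        if !self.stored.get(&handle.id)?.is::<T>() {
            return None;
        }
        self.stored.remove(&handle.id)?.downcast::<T>().ok()
    }
}

// Hypothetical user type for the demo.
pub struct Widget(pub u32);
impl Foo for Widget {}

fn main() {
    let mut manager = FooManager::default();
    let handle = manager.keep_foo(Box::new(Widget(7)));
    let widget: Box<Widget> = manager.return_foo(handle).expect("value was stored");
    println!("got back widget {}", widget.0);
}
```

Because Handle<T> is consumed by return_foo and is not Clone, a handle can be redeemed at most once; a handle from a different manager yields None instead of a wrong-typed value.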
I have a public trait, Parser, that defines an external interface. I then have a private ParserImpl struct that implements the methods (actually, I have several implementations, which is the idea behind using the trait to abstract away).
use std::io;
pub trait Parser {
// ...omitted
}
struct ParserImpl<R: io::Read> {
// ...omitted
stream: R,
}
impl<R: io::Read> ParserImpl<R> {
// ...methods
fn new(stream: R) -> ParserImpl<R> {
ParserImpl {
// ...omitted
stream: stream,
}
}
}
impl<R: io::Read> Parser for ParserImpl<R> {
// ...methods
}
To create a parser instance, I use a function to hide ParserImpl.
pub fn make_parser<'a, R>(stream: R) -> Box<dyn Parser + 'a>
where
R: io::Read + 'a,
{
Box::new(ParserImpl::new(stream))
}
This is all well and good... and it works... but the make_parser function troubles me. I feel that there must be a simpler way to approach this and that I'm missing something important, as this seems like a potential pitfall whenever using a trait like io::Read to abstract away the source of data.
I understand the need to specify lifetimes (Parameter type may not live long enough?) but I am a bit stumped on whether I can have both a clean and simple interface, and also use a trait like io::Read.
Is there a "cleaner," or perhaps more idiomatic way, to use traits like io::Read that I am missing? If not, that's okay, but I'm pretty new to Rust and when I wrote the above function I kept thinking "this can't be right..."
To make this sample runnable, here's a main:
fn main() {
use std::fs;
let file: fs::File = fs::File::open("blabby.txt").unwrap();
let parser = make_parser(file);
}
That is the idiomatic way of writing the code that has that meaning, but you may not want that meaning.
For example, if you don't need to create a boxed trait object, you can just return the parameterized value directly, or in this case just use the result of ParserImpl::new. This is my default form until I know I need dynamic dispatch provided by some trait object.
You could also require the 'static lifetime instead of introducing a new lifetime 'a, but this reduces the range of allowed types that you can pass into make_parser:
pub fn make_parser<R>(stream: R) -> Box<dyn Parser>
where
R: io::Read + 'static,
{
Box::new(ParserImpl::new(stream))
}
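As a sketch of the non-boxed form mentioned above: on current Rust you can return impl Parser, which keeps ParserImpl private without a Box or an explicit lifetime. This is my suggestion rather than something the original code used, and the read_all method is a made-up stand-in for the omitted trait methods. The boxed variant is included for comparison.

```rust
use std::io::{self, Read};

pub trait Parser {
    // Hypothetical method, standing in for the omitted interface.
    fn read_all(&mut self) -> io::Result<String>;
}

struct ParserImpl<R: Read> {
    stream: R,
}

impl<R: Read> Parser for ParserImpl<R> {
    fn read_all(&mut self) -> io::Result<String> {
        let mut text = String::new();
        self.stream.read_to_string(&mut text)?;
        Ok(text)
    }
}

// Static dispatch: the concrete type stays private and no lifetime
// parameter has to be spelled out by hand.
pub fn make_parser<R: Read>(stream: R) -> impl Parser {
    ParserImpl { stream }
}

// Dynamic dispatch, for when a trait object is really needed.
pub fn make_boxed_parser<'a, R: Read + 'a>(stream: R) -> Box<dyn Parser + 'a> {
    Box::new(ParserImpl { stream })
}

fn main() {
    // A byte slice implements io::Read, so no file is needed for the demo.
    let mut parser = make_parser("let x = 1;".as_bytes());
    println!("{}", parser.read_all().expect("reading from memory"));
}
```

The trade-off is that impl Parser always names one concrete type per call site, so it cannot be used to pick between several parser implementations at run time; that is where the boxed version is still the right tool.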
When writing callbacks for generic interfaces, it can be useful for them to define their own local data which they are responsible for creating and accessing.
In C I would just use a void pointer. A C-like example:
struct SomeTool {
    int type;
    void *custom_data;
};
void invoke(struct SomeTool *tool) {
    StructOnlyForThisTool *data = malloc(sizeof(*data));
    /* ... fill in the data ... */
    tool->custom_data = data;
}
void execute(struct SomeTool *tool) {
    StructOnlyForThisTool *data = tool->custom_data;
    if (data->foo_bar) { /* do something */ }
}
When writing something similar in Rust, I replace void * with Option<Box<dyn Any>>, but I'm finding that accessing the data is unreasonably verbose, e.g.:
use std::any::Any;
struct SomeTool {
    type_: i32, // `type` is a keyword in Rust, so the field needs another name
    custom_data: Option<Box<dyn Any>>,
}
fn invoke(tool: &mut SomeTool) {
    let data = StructOnlyForThisTool { /* my custom data */ };
    /* ... fill in the data ... */
    tool.custom_data = Some(Box::new(data));
}
fn execute(tool: &mut SomeTool) {
    let data = tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap();
    if data.foo_bar { /* do something */ }
}
There is one line here which I'd like to be able to write in a more compact way:
tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap()
tool.custom_data.as_mut().unwrap().downcast_mut::<StructOnlyForThisTool>().unwrap()
While each method makes sense on its own, in practice it's not something I'd want to write throughout a code-base, and not something I'm going to want to type out often or remember easily.
By convention, the uses of unwrap here aren't dangerous because:
While only some tools define custom data, the ones that do always define it.
When the data is set, by convention the tool only ever sets its own data. So there is no chance of having the wrong data.
Any time these conventions aren't followed, it's a bug and should panic.
Given these conventions, and assuming accessing custom-data from a tool is something that's done often - what would be a good way to simplify this expression?
Some possible options:
Remove the Option and just use Box<dyn Any>, with Box::new(()) representing None, so access can be simplified a little.
Use a macro or function that takes the Option<Box<dyn Any>> to hide the verbosity. This will work, of course, but I'd prefer not to and would use it only as a last resort.
Add an extension trait to Option<Box<dyn Any>> that exposes a method such as tool.custom_data.unwrap_box::<StructOnlyForThisTool>(), with a matching unwrap_box_mut.
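The third option could be sketched roughly like this. The trait name CustomDataExt, the panic messages and the field name kind are assumptions for the sake of a compilable example; the method names are the ones proposed above.

```rust
use std::any::Any;

/// Hypothetical extension trait for the `custom_data` field.
pub trait CustomDataExt {
    fn unwrap_box<T: Any>(&self) -> &T;
    fn unwrap_box_mut<T: Any>(&mut self) -> &mut T;
}

impl CustomDataExt for Option<Box<dyn Any>> {
    fn unwrap_box<T: Any>(&self) -> &T {
        self.as_ref()
            .expect("custom data not set")
            .downcast_ref::<T>()
            .expect("custom data has an unexpected type")
    }

    fn unwrap_box_mut<T: Any>(&mut self) -> &mut T {
        self.as_mut()
            .expect("custom data not set")
            .downcast_mut::<T>()
            .expect("custom data has an unexpected type")
    }
}

pub struct SomeTool {
    pub kind: i32,
    pub custom_data: Option<Box<dyn Any>>,
}

pub struct StructOnlyForThisTool {
    pub foo_bar: bool,
}

pub fn invoke(tool: &mut SomeTool) {
    tool.custom_data = Some(Box::new(StructOnlyForThisTool { foo_bar: true }));
}

pub fn execute(tool: &mut SomeTool) -> bool {
    // Only the `custom_data` field is borrowed here ...
    let data = tool.custom_data.unwrap_box_mut::<StructOnlyForThisTool>();
    // ... so other fields of `tool` remain accessible.
    tool.kind += 1;
    data.foo_bar = !data.foo_bar;
    data.foo_bar
}

fn main() {
    let mut tool = SomeTool { kind: 0, custom_data: None };
    invoke(&mut tool);
    println!("foo_bar after execute: {}", execute(&mut tool));
}
```

Since the method is called on the field rather than on the whole tool, the borrow covers only custom_data, and the panics keep the "violating the convention is a bug" behavior of the original unwrap chain.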
Update 1): since asking this question a point I didn't include seems relevant.
There may be multiple callback functions like execute which must all be able to access the custom_data. At the time I didn't think this was important to point out.
Update 2): Wrapping this in a function that takes tool isn't practical, since the borrow checker then prevents further access to members of tool until the cast variable goes out of scope. I found the only reliable way to do this was to write a macro.
If the implementation really only has a single method with a name like execute, that is a strong indication to consider using a closure to capture the implementation data. SomeTool can incorporate an arbitrary callable in a type-erased manner using a boxed FnMut, as shown in this answer. execute() then boils down to invoking the closure stored in the struct field, using (self.impl_)(). For a more general approach, which also works when the implementation has more methods, read on.
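A minimal sketch of the closure variant, assuming the callback takes no arguments and returns a String so the effect is observable; counting_tool and its captured counter are invented for the example.

```rust
pub struct SomeTool {
    // Type-erased callback; the closure owns whatever data it captured.
    impl_: Box<dyn FnMut() -> String>,
}

impl SomeTool {
    pub fn new<F>(callback: F) -> Self
    where
        F: FnMut() -> String + 'static,
    {
        SomeTool { impl_: Box::new(callback) }
    }

    pub fn execute(&mut self) -> String {
        // The parentheses are needed to call a closure stored in a field.
        (self.impl_)()
    }
}

/// Hypothetical tool whose private state lives inside the closure.
pub fn counting_tool() -> SomeTool {
    let mut runs = 0u32;
    SomeTool::new(move || {
        runs += 1;
        format!("run #{runs}")
    })
}

fn main() {
    let mut tool = counting_tool();
    println!("{}", tool.execute());
    println!("{}", tool.execute());
}
```

The captured variable plays the role of custom_data: it is created by the tool, owned by the tool, and typed, so no downcast is needed.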
An idiomatic and type-safe equivalent of the type+dataptr C pattern is to store the implementation type and pointer to data together as a trait object. The SomeTool struct can contain a single field, a boxed SomeToolImpl trait object, where the trait specifies tool-specific methods such as execute. This has the following characteristics:
You no longer need an explicit type field because the run-time type information is incorporated in the trait object.
Each tool's implementation of the trait methods can access its own data in a type-safe manner without casts or unwraps. This is because the trait object's vtable automatically invokes the correct function for the correct trait implementation, and it is a compile-time error to try to invoke a different one.
The "fat pointer" representation of the trait object has the same performance characteristics as the type+dataptr pair - for example, the size of SomeTool will be two pointers, and accessing the implementation data will still involve a single pointer dereference.
Here is an example implementation:
struct SomeTool {
impl_: Box<dyn SomeToolImpl>,
}
impl SomeTool {
fn execute(&mut self) {
self.impl_.execute();
}
}
trait SomeToolImpl {
fn execute(&mut self);
}
struct SpecificTool1 {
foo_bar: bool
}
impl SpecificTool1 {
pub fn new(foo_bar: bool) -> SomeTool {
let my_data = SpecificTool1 { foo_bar: foo_bar };
SomeTool { impl_: Box::new(my_data) }
}
}
impl SomeToolImpl for SpecificTool1 {
fn execute(&mut self) {
println!("I am {}", self.foo_bar);
}
}
struct SpecificTool2 {
num: u64
}
impl SpecificTool2 {
pub fn new(num: u64) -> SomeTool {
let my_data = SpecificTool2 { num: num };
SomeTool { impl_: Box::new(my_data) }
}
}
impl SomeToolImpl for SpecificTool2 {
fn execute(&mut self) {
println!("I am {}", self.num);
}
}
pub fn main() {
let mut tool1: SomeTool = SpecificTool1::new(true);
let mut tool2: SomeTool = SpecificTool2::new(42);
tool1.execute();
tool2.execute();
}
Note that, in this design, it doesn't make sense to make the impl_ field an Option because we always associate the tool type with the implementation. While it is perfectly valid to have an implementation without data, it must always have a type associated with it.