Can I implement a trait while capturing the environment?

Can I implement a trait while capturing the environment? - rust

I'm trying to implement A* search for Advent of Code 2019 (Yes, Slowpoke, I know). I've started like this:
fn find_path(start: Coords, goal: Coords, map: &Vec<Vec<char>>) -> Vec<Coords> {
struct Node {
distance: u32,
pos: Coords,
}
impl PartialEq for Node {
fn eq(&self, other: &Self) -> bool {
self.distance + manhattan(self.pos, goal) == other.distance + manhattan(other.pos, goal)
}
}
...
let mut edge = BinaryHeap::new();
edge.push(Node{distance: 0, pos: start});
...
Coords is a struct with an x and an y. The problem here is that I can't use goal in the trait, because it's not in scope. A closure would be able to capture it, but I am doubtful whether I can use a closure instead of a fn here. If so, what's the syntax? If not, is there a fundamental reason why it can't be done? I wasn't able to find an answer online.
I know the simple solution is to include goal in Node, but it's redundant because I'll be creating thousands of Node during A*, all of which will have the same goal, wasting memory and CPU cycles. In principle, goal could be a single global variable, but that's an untidy option.
Even though I'm sure including goal in Node would work fine in practice, I'd rather not.
Is there another idiomatic way of accomplishing what I'm trying to do?

No, you cannot capture any environment in an impl block. Closures capture the environment, so you cannot use a closure as a function in an impl block.
Functions and methods are designed to be called from any context, so there's no guarantee that there even is an environment to be captured. The fact that we can declare types, functions, methods, etc. inside of another function is basically a syntax nicety.
I'd probably create a type that wraps Node and goal:
struct Foo(Node, Coord);
impl Foo {
fn value(&self) -> WhateverType {
self.0.distance + manhattan(self.0.pos, self.1)
}
}
impl PartialEq for Foo {
fn eq(&self, other: &Self) -> bool {
self.value() == other.value()
}
}
See also:
The petgraph crate
Implement graph-like datastructure in Rust
Which algorithm from petgraph will find the shortest path from A to B?

The reason why I need PartialEq is that the Nodes are subsequently going into a BinaryHeap. If the BinaryHeap had an option to provide a custom comparator, that would be perfect: I wouldn't need Node to be Ord, and I could let goal reside in that comparator (closure).
It seems this is being considered: https://github.com/rust-lang/rust/pull/69454
In the meantime, there's a crate that provides that functionality: https://crates.io/crates/binary-heap-plus
For now, I'm going to accept the overhead of goal inside Node but it's good to know my options.

Related

Associated type for ndarray arguments in rust

I want to create an interface for a (numeric) algorithm for which I want to provide an implementation with ndarray and similar libraries (let's say pytorch bindings)
struct A<D> {
array: D
}
trait<D> T {
type ArgType;
fn foo(&mut self, other: &ArgType);
}
The type of the second argument other should depend on the selected generic type D. With ndarray, there are two types of arrays–those owning their data and views that share their data. The best way is to accept both types seems to be something like the following:
fn bar<S>(a: &ArrayBase<S, Ix2>)
where
S: Data<Elem = f64> {}
For the implementation of the trait T for A using ndarray, that would mean that I need something like this
use ndarray::{prelude::*, Data};
impl T<Array2<f64>> for A<Array2<f64>> {
type ArgType<S>=ArrayBase<S: Data<Elem=f64>, Ix2>;
fn foo(&mut self, other: &Self::ArgType){
///
}
}
Then, however, how would I add the new template parameter to foo. More importantly, generic associated types are not allowed in stable. I assume it would be simpler if there were a trait in ndarray that defined all methods, however, they are implemented directly for the ArrayBase type. I thought about writing a thin abstraction layer but that would be too complex (even if I only use a small subset of methods only). The asarray trait seemed to be a promising solution but it requires an lifetime parameter too (is there even a concept of associated traits?).
What would be the recommended way of handling this kind of situation; is there an easy way?

Maybe this approach is what you want:
use ndarray::{ArrayBase, Ix2};
trait TheAlgorithm<OtherArg> {
fn calculate(&self, other: &OtherArg) -> f64;
}
impl<ReprSelf, ReprOther> TheAlgorithm<ArrayBase<ReprOther, Ix2>>
for ArrayBase<ReprSelf, Ix2>
where
ReprSelf: ndarray::Data<Elem = f64>,
ReprOther: ndarray::Data<Elem = f64>,
{
fn calculate(&self, other: &ArrayBase<ReprOther, Ix2>) -> f64 {
// dummy calculation
self.view()[(0, 0)] + other.view()[(0, 0)]
}
}
This lets you call the calculate() method on either owned arrays or views, and the other argument can be owned or a view in either case.

How to achieve encapsulation of struct fields without borrowing the struct as a whole

My question has already been somewhat discussed here.
The problem is that I want to access multiple distinct fields of a struct in order to use them, but I don't want to work on the fields directly. Instead I'd like to encapsulate the access to them, to gain more flexibility.
I tried to achieve this by writing methods for this struct, but as you can see in the question mentioned above or my older question here this approach fails in Rust because the borrow checker will only allow you to borrow different parts of a struct if you do so directly, or if you use a method borrowing them together, since all the borrow checker knows from its signature is that self is borrowed and only one mutable reference to self can exist at any time.
Losing the flexibility to borrow different parts of a struct as mutable at the same time is of course not acceptable. Therefore I'd like to know whether there is any idiomatic way to do this in Rust.
My naive approach would be to write a macro instead of a function, performing (and encapsulating) the same functionality.
EDIT:
Because Frxstrem suggested that the question which I linked to at the top may answer my question I want to make it clear, that I'm not searching for some way to solve this. My question is which of the proposed solutions (if any) is the right way to do it?

After some more research it seems that there is no beautiful way to achieve partial borrowing without accessing the struct fields directly, at least as of now.
There are, however, multiple imperfect solutions to this which I'd like to discuss a little, so that anyone who might end up here may weigh the pros and cons for themself.
I'll use the code from my original question as an example to illustrate them here.
So let's say you have a struct:
struct GraphState {
nodes: Vec<Node>,
edges: Vec<Edge>,
}
and you're trying to do borrow one part of this struct mutably and the other immutably:
// THIS CODE DOESN'T PASS THE BORROW-CHECKER
impl GraphState {
pub fn edge_at(&self, edge_index: u16) -> &Edge {
&self.edges[usize::from(edge_index)]
}
pub fn node_at_mut(&mut self, node_index: u8) -> &mut Node {
&mut self.nodes[usize::from(node_index)]
}
pub fn remove_edge(&mut self, edge_index: u16) {
let edge = self.edge_at(edge_index); // first (immutable) borrow here
// update the edge-index collection of the nodes connected by this edge
for i in 0..2 {
let node_index = edge.node_indices[i];
self.node_at_mut(node_index).remove_edge(edge_index); // second (mutable)
// borrow here -> ERROR
}
}
}
}
But of course this fails, since you're not allowed to borrow self as mutable and immutable at the same time.
So to get around this you could simply access the fields directly:
impl GraphState {
pub fn remove_edge(&mut self, edge_index: u16) {
let edge = &self.edges[usize::from(edge_index)];
for i in 0..2 {
let node_index = edge.node_indices[i];
self.nodes[usize::from(node_index)].remove_edge(edge_index);
}
}
}
This approach works, but it has two major drawbacks:
The accessed fields need to be public (at least if you want to allow access to them from another scope). If they're implementation details that you'd rather have private you're in bad luck.
You always need to operate on the fields directly. This means that code like usize::from(node_index) needs to be repeated everywhere, which makes this approach both brittle and cumbersome.
So how can you solve this?
A) Borrow everything at once
Since borrowing self mutably multiple times is not allowed, mutably borrowing all the parts you want at once instead is one straightforward way to solve this problem:
pub fn edge_at(edges: &[Edge], edge_index: u16) -> &Edge {
&edges[usize::from(edge_index)]
}
pub fn node_at_mut(nodes: &mut [Node], node_index: u8) -> &mut Node {
&mut nodes[usize::from(node_index)]
}
impl GraphState {
pub fn data_mut(&mut self) -> (&mut [Node], &mut [Edge]) {
(&mut self.nodes, &mut self.edges)
}
pub fn remove_edge(&mut self, edge_index: u16) {
let (nodes, edges) = self.data_mut(); // first (mutable) borrow here
let edge = edge_at(edges, edge_index);
// update the edge-index collection of the nodes connected by this edge
for i in 0..2 {
let node_index = edge.node_indices[i];
node_at_mut(nodes, node_index).remove_edge(edge_index); // no borrow here
// -> no error
}
}
}
}
It's clearly a workaround and far from ideal, but it works and it allows you to keep the fields themselves private (though you'll probably need to expose the implementation somewhat, as the user has to hand the necessary data to the other functions by hand).
B) Use a macro
If code reuse is all you worry about and visibility is not an issue to you you can write macros like these:
macro_rules! node_at_mut {
($this:ident, $index:expr) => {
&mut self.nodes[usize::from($index)]
}
}
macro_rules! edge_at {
($this:ident, $index:expr) => {
&mut self.edges[usize::from($index)]
}
}
...
pub fn remove_edge(&mut self, edge_index: u16) {
let edge = edge_at!(self, edge_index);
// update the edge-index collection of the nodes connected by this edge
for i in 0..2 {
let node_index = edge.node_indices[i];
node_at_mut!(self, node_index).remove_edge(edge_index);
}
}
}
If your fields are public anyway I'd probably go with this solution, as it seems the most elegant to me.
C) Go unsafe
What we want to do here is obviously safe.
Sadly the borrow checker cannot see this, since the function signatures tell him no more than that self is being borrowed. Luckily Rust allows us to use the unsafe keyword for cases like these:
pub fn remove_edge(&mut self, edge_index: u16) {
let edge: *const Edge = self.edge_at(edge_index);
for i in 0..2 {
unsafe {
let node_index = (*edge).node_indices[i];
self.node_at_mut(node_index).remove_edge(edge_index); // first borrow here
// (safe though since remove_edge will not invalidate the first pointer)
}
}
}
This works and gives us all the benefits of solution A), but using unsafe for something that could easily be done without, if only the language had some syntax for actual partial borrowing, seems a bit ugly to me. On the other hand it may (in some cases) be preferable to solution A), due to its sheer clunkyness...
EDIT: After some thought I realised, that we know this approach to be safe here, only because we know about the implementation. In a different use case the data in question might actually be contained in one single field (say a Map), even if it looks like we are accessing two very distinct kinds of data when calling from outside.
This is why this last approach is rightfully unsafe, as there is no way for the borrow checker to check whether we are actually borrowing different things without exposing the private fields, rendering our effort pointless.
Contrary to my initial belief, this couldn't even really be changed by expanding the language. The reason is that one would still need to expose information about private fields in some way for this to work.
After writing this answer I also found this blog post which goes a bit deeper into possible solutions (and also mentions some advanced techniques that didn't come to my mind, none of which is universally applicable though). If you happen to know another solution, or a way to improve the ones proposed here, please let me know.

Is there any way to implement Into<&str> or From<T> for &str?

Was writing some code in Rust trying to define a CLI, using the (otherwise pretty great) crate clap, and run into a rather critical issue. Methods of App accept an Into<&'help str>, and i've been unable to find a way to implement this trait.
Indeed from what i understand, it is absolutely unimplementable:
struct JustWorkDamnIt {
string: String
}
impl From<JustWorkDamnIt> for &str {
fn from(arg: JustWorkDamnIt) -> Self {
return arg.string.as_str()
}
}
...which yields:
error[E0515]: cannot return value referencing local data `arg.string`
--> src/cmd/interactive.rs:25:16
|
25 | return arg.string.as_str()
| ----------^^^^^^^^^
| |
| returns a value referencing data owned by the current function
| `arg.string` is borrowed here
Interestingly enough, however, returning a literal compiles just fine (which i reckon is why clap doesn't mind using this trait). Presumably that's because the literal is compiled to some static region of memory and therefore isn't owned by the function:
impl From<JustWorkDamnIt> for &str {
fn from(arg: JustWorkDamnIt) -> Self {
return "literal"
}
}
But, i mean, surely there's a way to implement this trait and return dynamic strings? Maybe some clever use of Box<> or something, idk. I believe i've tried all the things i could think of.
And if there isn't a way, then this seems like a pretty glaring flaw for Rust - traits which can be declared and used in function headers, but cannot be actually implemented meaningfully (there's not much of a point in returning a literal). If this turns out to be the case i'll create an issue on the rust-lang repository for this flow, but first i wanted to sanity-check my findings here.
UPD: Thanks for the comments, made me think more in-depth about this issue.
The problem has to do with lifetimes, it seems. I apologize for not showing the entire code, i thought the snippets i shared would describe the problem sufficiently, but in hindsight it does make sense that the context would be important with Rust's lifetimes at play.
I did find a "solution" for this particular instance of the problem. Since the code in question will only run once per executable start, i can just Box::leak the &'static str and call it a day. Still, i would like to figure out if there's a more general solution which could be used for often-created dynamic strings.
impl Cmd for InteractiveCmd {
fn build_args_parser<'a, 'b>(&'b self, builder: App<'a>) -> App<'a> {
// leak() works here but i'm curious if there's a better way
let staticStr : &'static str = Box::leak(Box::from(format!("Interactive {} shell", APPNAME).as_str()));
let rv = builder
// pub fn about<S: Into<&'b str>>(mut self, about: S) -> Self
.about(staticStr)
.version("0.1");
return rv;
}
}
fn main() {
let interactive = InteractiveCmd::new();
let mut app = App::new(APPNAME)
.version(APPVER)
.author(AUTHORS)
.subcommand(interactive.build_args_parser(App::new("interactive")));
}
Currently i am faced with 2 points of confusion here:
Why is there no impl From<String> for &str? The compiler claims that one exists for impl From<String> for &mut str, and i'm not seeing the importance of mut here.
...and if there is a good reason to not impl From<String> for &str, would it make sense to somehow discourage the usage of Into<&str> in the public APIs of libraries?
Or maybe the issue is with my build_args_parser function and it's just not an option to offload the work to it as far as Rust is concerned?

Seems like you're trying to convert a String into a &'a str. You can do it like this:
use clap::App;
fn main() {
let s = format!("Interactive {} shell", "Testing");
let app = App::new("Testing")
.about(&*s); // or `.about(s.as_str());` it's the same
}
Here's the signature of the function that we want to call:
pub fn about<S: Into<&'b str>>(self, about: S) -> Self
So the about parameter must implement trait Into<&'b str>. According to std::convert::Into, we know that &'b str does implement trait Into<&'b str>. So now we just need a way to convert String into &'b str. The std::string::String tell us that there are two ways: use either s.as_str() or &*s.

Why doesn't String implement From<&String>?

Background
I know that in Rust people prefer &str rather than &String. But in some case we were only given &String.
One example is when you call std::iter::Iterator::peekable. The return value is a Peekable<I> object that wraps the original iterator into it and gives you one extra method peek.
The point here is that peek only gives you a reference to the iterator item. So if you have an iterator that contains Strings, you only have &String in this case. Of cause, you can easily use as_str to get a &str but in the code I will show below it is equivalent to a call to clone.
The question
This code
#[derive(Debug)]
struct MyStruct(String);
impl MyStruct {
fn new<T>(t: T) -> MyStruct
where
T: Into<String>,
{
MyStruct(t.into())
}
}
fn main() {
let s: String = "Hello world!".into();
let st: MyStruct = MyStruct::new(&s);
println!("{:?}", st);
}
doesn't compile because String doesn't implement From<&String>. This is not intuitive.
Why does this not work? Is it just a missing feature of the standard library or there are some other reasons that prevent the standard library from implementing it?
In the real code, I only have a reference to a String and I know to make it work I only need to call clone instead, but I want to know why.

To solve your problem, one could imagine adding a new generic impl to the standard library:
impl<'a, T: Clone> From<&'a T> for T { ... }
Or to make it more generic:
impl<B, O> From<B> for O where B: ToOwned<Owned=O> { ... }
However, there are two problems with doing that:
Specialization: the specialization feature that allows to overlapping trait-impls is still unstable. It turns out that designing specialization in a sound way is way more difficult than expected (mostly due to lifetimes).
Without it being stable, the Rust devs are very careful not to expose that feature somewhere in the standard library's public API. This doesn't mean that it isn't used at all in std! A famous example is the specialized ToString impl for str. It was introduced in this PR. As you can read in the PR's discussion, they only accepted it because it does not change the API (to_string() was already implemented for str).
However, it's different when we would add the generic impl above: it would change the API. Thus, it's not allowed in std yet.
core vs std: the traits From and Into are defined in the
core library, whereas Clone and ToOwned are defined in std. This means that we can't add a generic impl in core, because core doesn't know anything about std. But we also can't add the generic impl in std, because generic impls need to be in the same crate as the trait (it's a consequence of the orphan rules).
Thus, it would required some form of refactoring and moving around definitions (which may or may not be difficult) before able to add such a generic impl.
Note that adding
impl<'a> From<&'a String> for String { ... }
... works just fine. It doesn't require specialization and doesn't have problems with orphan rules. But of course, we wouldn't want to add a specific impl, when the generic impl would make sense.
(thanks to the lovely people on IRC for explaining stuff to me)

Since String does implement From<&str>, you can make a simple change:
fn main() {
let s: String = "Hello world!".into();
// Replace &s with s.as_str()
let st: MyStruct = MyStruct::new(s.as_str());
println!("{:?}", st);
}
All &Strings can be trivially converted into &str via as_str, which is why all APIs should prefer to use &str; it's a strict superset of accepting &String.

Should I return &Option<Foo> or Option<&Foo> from a getter?

I have a struct:
struct Foo {}
struct Boo {
foo: Option<Foo>,
}
I want to create getter for it, so user cannot modify it but can read it:
impl Boo {
pub fn get_foo(&self) -> ? { unimplemented!(); }
}
Should I return &Option<Foo> or Option<&Foo>? Are there any advantages between these two variants?
I use both variants in my program, and it became an inconvenience to mix them, so I want to choose one of them for the entire program.

Use Option<&T> instead of &Option<T>. Callers are interested in the wrapped value, not in Option.
Also, the common way to implement a getter like this is as follows:
impl Boo {
pub fn get_foo(&self) -> Option<&Foo> { self.foo.as_ref() }
}
This way you don't need to check the wrapped value in the getter. If you want to return a mutable value, use as_mut() instead.

When in doubt, pick the most flexible solution. This leaves you the most leeway in the future to change the internals of the struct without altering its API.
In this case, this means picking Option<&T>:
&Option<T> forces you to have a reference to an option,
Option<&T> only requires a reference to a T.
So, for example, in the latter case I could store a Vec<T> or a Result<T, Error> and still be able to hand out a Option<&T>. It is more flexible.
Note: this is why interfaces generally use &str instead of &String, &[T] instead of &Vec<T>, ... more flexibility!

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Can I implement a trait while capturing the environment? - rust

Related

Associated type for ndarray arguments in rust

How to achieve encapsulation of struct fields without borrowing the struct as a whole

Is there any way to implement Into<&str> or From<T> for &str?

Why doesn't String implement From<&String>?

Should I return &Option<Foo> or Option<&Foo> from a getter?

Categories

Resources