Multithreading with a Vector of different functions

Multithreading with a Vector of different functions - multithreading

I am trying to run multiple functions on different threads. I wrote this (minimal) code that works
use std::thread;
fn f1(count: usize) {
for i in 0..count { /*do something*/ }
}
fn f2(count: usize) {
for i in 0..count { /*do something*/ }
}
fn run(ops: &Vec<impl Fn(usize) + Sync + Send + Copy + 'static>) {
for i in 0..ops.len() {
let func = ops[i];
thread::spawn(move || func(1000));
}
}
fn main() {
let ops = vec![f1, f1]; // putting the same function twice
run(&ops);
}
However, as soon as I send different functions
let ops = vec![f1, f2];
Compiling fails. As I understand it, I cannot hold different types in the same vector (I guess, functions have different memory requirements, though I'm not quite sure how function pointers work).
I tried browsing similar questions on SO, and I tried this solution
let ops: Vec<&dyn Fn(usize)> = vec![&f1, &f2];
and I get this error
`dyn Fn(usize)` cannot be shared between threads safely
the trait `Sync` is not implemented for `dyn Fn(usize)`
And I'm stuck as I'm struggling to understand the base issue. Do you have insights on how I should understand this problem, and any pointers on how to solve it ?
Thank you!

Just as you can add the Sync marker with impl, you can also add it with &dyn, but you may need parenthesis to disambiguate:
fn run(ops: &Vec<&'static (dyn Fn(usize) + Sync)>)
Two minor comments:
Generally, using &[…] instead of &Vec<…> is more flexible and to be preferred.
No need to do for i in 0…foos.len() in rust, for foo in foos will do fine. (You may occasionally have to use for foo in &foos or for foo in foos.iter(). If you do need the index, .enumerate() is convenient.)
If the 'static is a bother (e.g. because you want to pass in some closure that captures local variables and can't be 'static), you could either use scoped threads (example), or pass owned Fns as Vec<Box<dyn Fn(usize) + Sync>>.

Related

Is there a way to reference an anonymous type as a struct member?

Consider the following Rust code:
use std::future::Future;
use std::pin::Pin;
fn main() {
let mut v: Vec<_> = Vec::new();
for _ in 1..10 {
v.push(wrap_future(Box::pin(async {})));
}
}
fn wrap_future<T>(a: Pin<Box<dyn Future<Output=T>>>) -> impl Future<Output=T> {
async {
println!("doing stuff before awaiting");
let result=a.await;
println!("doing stuff after awaiting");
result
}
}
As you can see, the futures I'm putting into the Vec don't need to be boxed, since they are all the same type and the compiler can infer what that type is.
I would like to create a struct that has this Vec<...> type as one of its members, so that I could add a line at the end of main():
let thing = MyStruct {myvec: v};
without any additional overhead (i.e. boxing).
Type inference and impl Trait syntax aren't allowed on struct members, and since the future type returned by an async block exists entirely within the compiler and is exclusive to that exact async block, there's no way to reference it by name. It seems to me that what I want to do is impossible. Is it? If so, will it become possible in a future version of Rust?
I am aware that it would be easy to sidestep this problem by simply boxing all the futures in the Vec as I did the argument to wrap_future() but I'd prefer not to do this if I can avoid it.
I am well aware that doing this would mean that there could be only one async block in my entire codebase whose result values could possibly be added to such a Vec, and thus that there could be only one function in my entire codebase that could create values that could possibly be pushed to it. I am okay with this limitation.

Nevermind, I'm stupid. I forgot that structs could have type parameters.
struct MyStruct<F> where F: Future<Output=()> {
myvec: Vec<F>,
}

How to accept either ownerhip or mutable reference for the same method

I am making a library in Rust around a TCP protocol. And I am also trying to make it async runtime independent.
I would like to make it accept ownership of an existing connection OR accept a mutable reference to it. I.e. I want to let the caller say who should drop the connection.
I have this
pub struct ReadableStream<T: AsyncRead + AsyncWrite> {
reader: FramedRead<ReadHalf<T>, WordCodec>,
}
pub struct WriteableStream<T: AsyncRead + AsyncWrite> {
writer: FramedWrite<WriteHalf<T>, WordCodec>,
}
pub struct AsyncStream<T: AsyncRead + AsyncWrite> {
read_stream: ReadableStream<T>,
write_stream: WriteableStream<T>,
}
impl<T: AsyncRead + AsyncWrite> From<T> for AsyncStream<T> {
fn from(stream: T) -> Self {
let parts = stream.split();
Self {
read_stream: ReadableStream::new(parts.0),
write_stream: WriteableStream::new(parts.1),
}
}
}
impl<'stream, T: AsyncRead + AsyncWrite + Unpin> From<&'stream mut T>
for AsyncStream<&'stream mut T>
{
fn from(stream: &'stream mut T) -> Self {
let parts = stream.split();
Self {
read_stream: ReadableStream::new(parts.0),
write_stream: WriteableStream::new(parts.1),
}
}
}
But this fails to compile with "conflicting implementation for AsyncStream<&mut _>". If I uncomment the second implementation, the first one compiles. I can also compile only the second one... But I want to have both. I would expect either to be called based on whether from() was called like AsyncStream::from(source) or like AsyncStream::from(&mut source).
So how could I accomplish that?
Note: FramedRead/FramedWrite are from asynchronous-codec. AsyncRead, AsyncWrite, ReadHalf, WriteHalf are from the futures crate. WordCodec is my implementation of the protocol tailored to asynchronous-codec that is irrelevant to the question at hand.

I decided what I thought is biting the bullet by implementing a method in my struct ( AsyncStream ) to do the conversion, and maybe implement From for the specific runtimes I plan to test against, enabling users to use the struct's method when their runtime is not covered.
This approach works... but furthermore than that, it revealed that what I was trying to do is not exactly feasible for a different reason. I want to have the ability to split the connection into reader and writer, and I can't do that since split() takes ownership.
I'll solve the ownership/mutable reference problem by accepting ownership only, and providing a method that consumes self and returns the wrapped stream. Not exactly the ergonomics I was hoping for, but it addresses the scenario about wanting to let the user use my lib from a point, to a point, and switch from/to something else.

Sharing a variable with a closure

I know, the answers for similar questions already exist... kinda... all that I found were very high level instructions rather than code examples and I couldn't really follow them and I've been stuck in place for days now...
I need to share a variable with a closure. So I have my MockSlave struct. I want it to create a vector that can then be shared with the shell closure. The current version looks something like this
impl MockSlave {
pub async fn new(port: i32) -> Result<Self, Box<dyn std::error::Error>> {
let addr = String::from(format!("https://127.0.0.1:{}", port).as_str());
let endpoint = Endpoint::from_str(addr.as_str())?;
let commands = Rc::new(RefCell::new(Vec::new()));
let client = SlaveClient::new(endpoint, CommandProcessors {
shell: Some(|command| {
commands.borrow_mut().push(command);
Ok(())
})
}).await?;
Ok(MockSlave {
client,
commands
})
}
For clarity, here's SlaveClient::new
pub async fn new(endpoint: Endpoint, processors: CommandProcessors) -> Result<Self, tonic::transport::Error> {
let slave = SlaveClient {
client: SlaveManagerClient::connect(endpoint).await?,
processors
};
Ok(slave)
}
And here's CommandProcessors:
pub struct ReadWriteStreamingCommand {
pub command: Command
}
pub type ReadWriteSteamingCommandprocessor = fn(Box<ReadWriteStreamingCommand>) -> Result<(), Box<dyn Error>>;
pub struct CommandProcessors {
pub shell: Option<ReadWriteSteamingCommandprocessor>
}
Now, the fun part is that VSCode and cargo build give me slightly different errors about the same thing.
VSCode says:
closures can only be coerced to `fn` types if they do not capture any variables
Meanwhile, cargo build says
error[E0308]: mismatched types
--> tests/api_tests/src/mock_slave.rs:20:25
|
20 | shell: Some(|command| {
| _________________________^
21 | | commands.borrow_mut().push(command);
22 | | Ok(())
23 | | })
| |_____________^ expected fn pointer, found closure
Please help. This is at least my 10th approach to this problem over the course of a week. I am not going to move forward by myself...

fn(Box<ReadWriteStreamingCommand>) -> Result<(), Box<dyn Error>>;
This is not the type you think it is. This is the type of pointers to functions that exist statically in your code. That is, functions defined at the top-level or closures which don't actually close over any variables. To accept general Rust callables (including traditional functions and closures), you need one of Fn, FnMut, or FnOnce.
I can't tell from the small code you've shown which of the three you need, but the basic idea is this.
Use FnOnce if you are only going to call the function at most once. This is useful if you're writing some kind of "control flow" type of construct or mapping on the inside of an Option (where at most one value exists).
Use FnMut if you are going to call the function several times but never concurrently. This is what's used by most Rust built-in collection functions like map and filter, since it's always called in a single thread. This is the one I find myself using the most often for general-purpose things.
Use Fn if you need concurrent access and are planning to share across threads.
Every Fn is an FnMut (any function that can be called concurrently can be called sequentially), and every FnMut is an FnOnce (any function that can be called can obviously be called once), so you should choose the highest one on the list above that you can handle.
I'll assume you want FnMut. If you decide one of the others is better, the syntax is the same, but you'll just need to change the name. The type you want, then, is
dyn FnMut(Box<ReadWriteStreamingCommand>) -> Result<(), Box<dyn Error>>
But this is a trait object and hence is not Sized, so you can't do much with it. In particular, you definitely can't put it in another Sized data structure. Instead, we'll wrap it in a Box.
pub type ReadWriteSteamingCommandprocessor = Box<dyn FnMut(Box<ReadWriteStreamingCommand>) -> Result<(), Box<dyn Error>>>;
And then
shell: Some(Box::new(|command| {
commands.borrow_mut().push(command);
Ok(())
}))

How to store async closure created at runtime in a struct?

I'm learning Rust's async/await feature, and stuck with the following task. I would like to:
Create an async closure (or better to say async block) at runtime;
Pass created closure to constructor of some struct and store it;
Execute created closure later.
Looking through similar questions I wrote the following code:
use tokio;
use std::pin::Pin;
use std::future::Future;
struct Services {
s1: Box<dyn FnOnce(&mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()>>>>,
}
impl Services {
fn new(f: Box<dyn FnOnce(&mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()>>>>) -> Self {
Services { s1: f }
}
}
enum NumberOperation {
AddOne,
MinusOne
}
#[tokio::main]
async fn main() {
let mut input = vec![1,2,3];
let op = NumberOperation::AddOne;
let s = Services::new(Box::new(|numbers: &mut Vec<usize>| Box::pin(async move {
for n in numbers {
match op {
NumberOperation::AddOne => *n = *n + 1,
NumberOperation::MinusOne => *n = *n - 1,
};
}
})));
(s.s1)(&mut input).await;
assert_eq!(input, vec![2,3,4]);
}
But above code won't compile, because of invalid lifetimes.
How to specify lifetimes to make above example compile (so Rust will know that async closure should live as long as input). As I understand in provided example Rust requires closure to have static lifetime?
Also it's not clear why do we have to use Pin<Box> as return type?
Is it possible somehow to refactor code and eliminate: Box::new(|arg: T| Box::pin(async move {}))? Maybe there is some crate?
Thanks
Update
There is similar question How can I store an async function in a struct and call it from a struct instance?
. Although that's a similar question and actually my example is based on one of the answers from that question. Second answer contains information about closures created at runtime, but seems it works only when I pass an owned variable, but in my example I would like to pass to closure created at runtime mutable reference, not owned variable.

How to specify lifetimes to make above example compile (so Rust will know that async closure should live as long as input). As I understand in provided example Rust requires closure to have static lifetime?
Let's take a closer look at what happens when you invoke the closure:
(s.s1)(&mut input).await;
// ^^^^^^^^^^^^^^^^^^
// closure invocation
The closure immediately returns a future. You could assign that future to a variable and hold on to it until later:
let future = (s.s1)(&mut input);
// do some other stuff
future.await;
The problem is, because the future is boxed, it could be held around for the rest of the program's life without ever being driven to completion; that is, it could have 'static lifetime. And input must obviously remain borrowed until the future resolves: else imagine, for example, what would happen if "some other stuff" above involved modifying, moving or even dropping input—consider what would then happen when the future is run?
One solution would be to pass ownership of the Vec into the closure and then return it again from the future:
let s = Services::new(Box::new(move |mut numbers| Box::pin(async move {
for n in &mut numbers {
match op {
NumberOperation::AddOne => *n = *n + 1,
NumberOperation::MinusOne => *n = *n - 1,
};
}
numbers
})));
let output = (s.s1)(input).await;
assert_eq!(output, vec![2,3,4]);
See it on the playground.
#kmdreko's answer shows how you can instead actually tie the lifetime of the borrow to that of the returned future.
Also it's not clear why do we have to use Pin as return type?
Let's look at a stupidly simple async block:
async {
let mut x = 123;
let r = &mut x;
some_async_fn().await;
*r += 1;
x
}
Notice that execution may pause at the await. When that happens, the incumbent values of x and r must be stored temporarily (in the Future object: it's just a struct, in this case with fields for x and r). But r is a reference to another field in the same struct! If the future were then moved from its current location to somewhere else in memory, r would still refer to the old location of x and not the new one. Undefined Behaviour. Bad bad bad.
You may have observed that the future can also hold references to things that are stored elsewhere, such as the &mut input in #kmdreko's answer; because they are borrowed, those also cannot be moved for the duration of the borrow. So why can't the immovability of the future similarly be enforced by r's borrowing of x, without pinning? Well, the future's lifetime would then depend on its content—and such circularities are impossible in Rust.
This, generally, is the problem with self-referential data structures. Rust's solution is to prevent them from being moved: that is, to "pin" them.
Is it possible somehow to refactor code and eliminate: Box::new(|arg: T| Box::pin(async move {}))? Maybe there is some crate?
In your specific example, the closure and future can reside on the stack and you can simply get rid of all the boxing and pinning (the borrow-checker can ensure stack items don’t move without explicit pinning). However, if you want to return the Services from a function, you'll run into difficulties stating its type parameters: impl Trait would normally be your go-to solution for this type of problem, but it's limited and does not (currently) extend to associated types, such as that of the returned future.
There are work-arounds, but using boxed trait objects is often the most practical solution—albeit it introduces heap allocations and an additional layer of indirection with commensurate runtime cost. Such trait objects are however unavoidable where a single instance of your Services structure may hold different closures in s1 over the course of its life, where you're returning them from trait methods (which currently can’t use impl Trait), or where you're interfacing with a library that does not provide any alternative.

If you want your example to work as is, the missing component is communicating to the compiler what lifetime associations are allowed. Trait objects like dyn Future<...> are constrained to be 'static by default, which means it cannot have references to non-static objects. This is a problem because your closure returns a Future that needs to keep a reference to numbers in order to work.
The direct fix is to annotate that the dyn FnOnce can return a Future that can be bound to the life of the first parameter. This requires a higher-ranked trait bound and the syntax looks like for<'a>:
struct Services {
s1: Box<dyn for<'a> FnOnce(&'a mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()> + 'a>>>,
}
impl Services {
fn new(f: Box<dyn for<'a> FnOnce(&'a mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()> + 'a>>>) -> Self {
Services { s1: f }
}
}
The rest of your code now compiles without modification, check it out on the playground.

How to achieve encapsulation of struct fields without borrowing the struct as a whole

My question has already been somewhat discussed here.
The problem is that I want to access multiple distinct fields of a struct in order to use them, but I don't want to work on the fields directly. Instead I'd like to encapsulate the access to them, to gain more flexibility.
I tried to achieve this by writing methods for this struct, but as you can see in the question mentioned above or my older question here this approach fails in Rust because the borrow checker will only allow you to borrow different parts of a struct if you do so directly, or if you use a method borrowing them together, since all the borrow checker knows from its signature is that self is borrowed and only one mutable reference to self can exist at any time.
Losing the flexibility to borrow different parts of a struct as mutable at the same time is of course not acceptable. Therefore I'd like to know whether there is any idiomatic way to do this in Rust.
My naive approach would be to write a macro instead of a function, performing (and encapsulating) the same functionality.
EDIT:
Because Frxstrem suggested that the question which I linked to at the top may answer my question I want to make it clear, that I'm not searching for some way to solve this. My question is which of the proposed solutions (if any) is the right way to do it?

After some more research it seems that there is no beautiful way to achieve partial borrowing without accessing the struct fields directly, at least as of now.
There are, however, multiple imperfect solutions to this which I'd like to discuss a little, so that anyone who might end up here may weigh the pros and cons for themself.
I'll use the code from my original question as an example to illustrate them here.
So let's say you have a struct:
struct GraphState {
nodes: Vec<Node>,
edges: Vec<Edge>,
}
and you're trying to do borrow one part of this struct mutably and the other immutably:
// THIS CODE DOESN'T PASS THE BORROW-CHECKER
impl GraphState {
pub fn edge_at(&self, edge_index: u16) -> &Edge {
&self.edges[usize::from(edge_index)]
}
pub fn node_at_mut(&mut self, node_index: u8) -> &mut Node {
&mut self.nodes[usize::from(node_index)]
}
pub fn remove_edge(&mut self, edge_index: u16) {
let edge = self.edge_at(edge_index); // first (immutable) borrow here
// update the edge-index collection of the nodes connected by this edge
for i in 0..2 {
let node_index = edge.node_indices[i];
self.node_at_mut(node_index).remove_edge(edge_index); // second (mutable)
// borrow here -> ERROR
}
}
}
}
But of course this fails, since you're not allowed to borrow self as mutable and immutable at the same time.
So to get around this you could simply access the fields directly:
impl GraphState {
pub fn remove_edge(&mut self, edge_index: u16) {
let edge = &self.edges[usize::from(edge_index)];
for i in 0..2 {
let node_index = edge.node_indices[i];
self.nodes[usize::from(node_index)].remove_edge(edge_index);
}
}
}
This approach works, but it has two major drawbacks:
The accessed fields need to be public (at least if you want to allow access to them from another scope). If they're implementation details that you'd rather have private you're in bad luck.
You always need to operate on the fields directly. This means that code like usize::from(node_index) needs to be repeated everywhere, which makes this approach both brittle and cumbersome.
So how can you solve this?
A) Borrow everything at once
Since borrowing self mutably multiple times is not allowed, mutably borrowing all the parts you want at once instead is one straightforward way to solve this problem:
pub fn edge_at(edges: &[Edge], edge_index: u16) -> &Edge {
&edges[usize::from(edge_index)]
}
pub fn node_at_mut(nodes: &mut [Node], node_index: u8) -> &mut Node {
&mut nodes[usize::from(node_index)]
}
impl GraphState {
pub fn data_mut(&mut self) -> (&mut [Node], &mut [Edge]) {
(&mut self.nodes, &mut self.edges)
}
pub fn remove_edge(&mut self, edge_index: u16) {
let (nodes, edges) = self.data_mut(); // first (mutable) borrow here
let edge = edge_at(edges, edge_index);
// update the edge-index collection of the nodes connected by this edge
for i in 0..2 {
let node_index = edge.node_indices[i];
node_at_mut(nodes, node_index).remove_edge(edge_index); // no borrow here
// -> no error
}
}
}
}
It's clearly a workaround and far from ideal, but it works and it allows you to keep the fields themselves private (though you'll probably need to expose the implementation somewhat, as the user has to hand the necessary data to the other functions by hand).
B) Use a macro
If code reuse is all you worry about and visibility is not an issue to you you can write macros like these:
macro_rules! node_at_mut {
($this:ident, $index:expr) => {
&mut self.nodes[usize::from($index)]
}
}
macro_rules! edge_at {
($this:ident, $index:expr) => {
&mut self.edges[usize::from($index)]
}
}
...
pub fn remove_edge(&mut self, edge_index: u16) {
let edge = edge_at!(self, edge_index);
// update the edge-index collection of the nodes connected by this edge
for i in 0..2 {
let node_index = edge.node_indices[i];
node_at_mut!(self, node_index).remove_edge(edge_index);
}
}
}
If your fields are public anyway I'd probably go with this solution, as it seems the most elegant to me.
C) Go unsafe
What we want to do here is obviously safe.
Sadly the borrow checker cannot see this, since the function signatures tell him no more than that self is being borrowed. Luckily Rust allows us to use the unsafe keyword for cases like these:
pub fn remove_edge(&mut self, edge_index: u16) {
let edge: *const Edge = self.edge_at(edge_index);
for i in 0..2 {
unsafe {
let node_index = (*edge).node_indices[i];
self.node_at_mut(node_index).remove_edge(edge_index); // first borrow here
// (safe though since remove_edge will not invalidate the first pointer)
}
}
}
This works and gives us all the benefits of solution A), but using unsafe for something that could easily be done without, if only the language had some syntax for actual partial borrowing, seems a bit ugly to me. On the other hand it may (in some cases) be preferable to solution A), due to its sheer clunkyness...
EDIT: After some thought I realised, that we know this approach to be safe here, only because we know about the implementation. In a different use case the data in question might actually be contained in one single field (say a Map), even if it looks like we are accessing two very distinct kinds of data when calling from outside.
This is why this last approach is rightfully unsafe, as there is no way for the borrow checker to check whether we are actually borrowing different things without exposing the private fields, rendering our effort pointless.
Contrary to my initial belief, this couldn't even really be changed by expanding the language. The reason is that one would still need to expose information about private fields in some way for this to work.
After writing this answer I also found this blog post which goes a bit deeper into possible solutions (and also mentions some advanced techniques that didn't come to my mind, none of which is universally applicable though). If you happen to know another solution, or a way to improve the ones proposed here, please let me know.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Multithreading with a Vector of different functions - multithreading

Related

Is there a way to reference an anonymous type as a struct member?

How to accept either ownerhip or mutable reference for the same method

Sharing a variable with a closure

How to store async closure created at runtime in a struct?

How to achieve encapsulation of struct fields without borrowing the struct as a whole

Categories

Resources