Modifying an Arc<Mutex<_>> from two tokio tasks

I have an Arc<Mutex<Vec<i32>>>, my_data that holds some integers. I have a task that pushes integers (retrieved from an unbounded channel) and another task that gets the last integer from my_data, uses it and pushes a new integer back into my_data (and sends it over to a new channel).
The problem I'm having is: use of moved value: my_data.
I cannot remove the move when spawning a new task because of lifetimes; each task needs to have ownership of the data.
The application will then spawn another task, obtain the lock of my_data in order to gain access to it, dump it to local disk and .drain() it after that.
use tokio::sync::Mutex;
use tokio::sync::mpsc;
use std::sync::Arc;

#[tokio::main]
async fn main() {
    let my_data: Vec<i32> = vec![];
    let my_data = Arc::new(Mutex::new(my_data));

    let (tx1, mut rx1) = mpsc::unbounded_channel();
    tokio::spawn(async move {
        tx1.send(10).unwrap();
    }).await.unwrap();

    let handle1 = tokio::spawn(async move {
        while let Some(x) = rx1.recv().await {
            let my_data_clone = Arc::clone(&my_data);
            let mut my_data_lock = my_data_clone.lock().await;
            my_data_lock.push(x);
        }
    });

    let (tx2, mut rx2) = mpsc::unbounded_channel();
    tokio::spawn(async move {
        tx2.send(3).unwrap();
    }).await.unwrap();

    let handle2 = tokio::spawn(async move {
        while let Some(x) = rx2.recv().await {
            let my_data_clone = Arc::clone(&my_data); // <-- use of moved value: my_data
            let mut my_data_lock = my_data_clone.lock().await;
            // Want: get last value, do something to it, then push it
            let last_value = my_data_lock.last(); // 10
            let value_to_push = x * last_value.unwrap(); // 3 * 10
            my_data_lock.push(value_to_push); // my_data: [10, 30]
        }
    });

    let (_, _) = tokio::join!(handle1, handle2);
}

Reason: ownership of my_data is moved into the first task (handle1) before you ever create a clone for the second task, so the Arc::clone calls inside the loops come too late to help.
Solution: the easiest fix is to create a usable clone of the Arc before the first task starts executing. The code below has only one Arc::clone call, and it lives in main.
...
// ADD THIS LINE:
let my_data_clone = Arc::clone(&my_data);

let handle1 = tokio::spawn(async move {
    while let Some(x) = rx1.recv().await {
        // REMOVE THIS: let my_data_clone = Arc::clone(&my_data);
        let mut my_data_lock = my_data.lock().await;
        my_data_lock.push(x);
    }
});
...
let handle2 = tokio::spawn(async move {
    while let Some(x) = rx2.recv().await {
        // REMOVE THIS: let my_data_clone = Arc::clone(&my_data);
        let mut my_data_lock = my_data_clone.lock().await;
        let last_value = my_data_lock.last(); // 10
        let value_to_push = x * last_value.unwrap(); // 3 * 10
        my_data_lock.push(value_to_push); // my_data: [10, 30]
    }
});
...
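For reference, here is a complete sketch of the fixed program. It keeps the structure of the question; the only behavioural addition is that the second task copies the last value out and skips the push while the Vec is still empty, instead of unwrapping, so it cannot panic if it happens to run before the first task.

use std::sync::Arc;
use tokio::sync::{mpsc, Mutex};

#[tokio::main]
async fn main() {
    let my_data = Arc::new(Mutex::new(Vec::<i32>::new()));
    // Clone the Arc in main, before either task takes ownership of a handle.
    let my_data_clone = Arc::clone(&my_data);

    let (tx1, mut rx1) = mpsc::unbounded_channel();
    tokio::spawn(async move {
        tx1.send(10).unwrap();
    }).await.unwrap();

    let handle1 = tokio::spawn(async move {
        while let Some(x) = rx1.recv().await {
            my_data.lock().await.push(x);
        }
    });

    let (tx2, mut rx2) = mpsc::unbounded_channel();
    tokio::spawn(async move {
        tx2.send(3).unwrap();
    }).await.unwrap();

    let handle2 = tokio::spawn(async move {
        while let Some(x) = rx2.recv().await {
            let mut guard = my_data_clone.lock().await;
            // Copying the last value out (and skipping the push when the Vec is
            // still empty) is an addition here: the original unwrap() would panic
            // if this task ran before handle1 had pushed anything.
            if let Some(last) = guard.last().copied() {
                guard.push(x * last); // e.g. 3 * 10 = 30
            }
        }
    });

    let _ = tokio::join!(handle1, handle2);
}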

Related

"there is no signal driver running, must be called from the context of Tokio runtime" despite running in Runtime::block_on

I'm writing a commandline application using Tokio that controls its lifecycle by listening for keyboard interrupt events (i.e. ctrl + c); however, at the same time it must also monitor the other tasks that are spawned and potentially initiate an early shutdown if any of the tasks panic or otherwise encounter an error. To do this, I have wrapped tokio::select in a while loop that terminates once the application has at least had a chance to safely shut down.
However, as soon as the select block polls the future returned by tokio::signal::ctrl_c, the main thread panics with the following message:
thread 'main' panicked at 'there is no signal driver running, must be called from the context of Tokio runtime'
...which is confusing, because this is all done inside a Runtime::block_on call. I haven't published this application (yet), but the problem can be reproduced with the following code:
use tokio::runtime::Builder;
use tokio::signal;
use tokio::sync::watch;
use tokio::task::JoinSet;

fn main() {
    let runtime = Builder::new_multi_thread().worker_threads(2).build().unwrap();
    runtime.block_on(async {
        let _rt_guard = runtime.enter();
        let (ping_tx, mut ping_rx) = watch::channel(0u32);
        let (pong_tx, mut pong_rx) = watch::channel(0u32);
        let mut tasks = JoinSet::new();

        let ping = tasks.spawn(async move {
            let mut val = 0u32;
            ping_tx.send(val).unwrap();
            while val < 10u32 {
                pong_rx.changed().await.unwrap();
                val = *pong_rx.borrow();
                ping_tx.send(val + 1).unwrap();
                println!("ping! {}", val + 1);
            }
        });

        let pong = tasks.spawn(async move {
            let mut val = 0u32;
            while val < 10u32 {
                ping_rx.changed().await.unwrap();
                val = *ping_rx.borrow();
                pong_tx.send(val + 1).unwrap();
                println!("pong! {}", val + 1);
            }
        });

        let mut interrupt = Box::pin(signal::ctrl_c());
        let mut interrupt_read = false;
        while !interrupt_read && !tasks.is_empty() {
            tokio::select! {
                biased;
                _ = &mut interrupt, if !interrupt_read => {
                    ping.abort();
                    pong.abort();
                    interrupt_read = true;
                },
                _ = tasks.join_next() => {}
            }
        }
    });
}
Rust Playground
This example is a bit contrived, but the important parts are:
I am intentionally using Runtime::block_on() instead of tokio::main as I want to control the number of runtime threads at runtime.
Although, curiously, this example works if rewritten to use tokio::main.
I added let _rt_guard = runtime.enter() to ensure that the runtime context was set, but its presence or absence doesn't appear to make a difference.
I discovered the answer while I was finishing writing up the question, but since I couldn't find it by searching on the error message, I'll share it here.
As I noted, the example I gave works if run from within a function annotated with tokio::main. Looking at the docs, the main macro expands to:
fn main() {
    tokio::runtime::Builder::new_multi_thread()
        .enable_all()
        .build()
        .unwrap()
        .block_on(async {
            println!("Hello world");
        })
}
The example I gave did not call enable_all() on the builder (or, more specifically, enable_io()). Adding this call to the builder chain stops the panic from occurring. The runtime.enter() call I made can also be removed, since it wasn't really doing anything.
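Concretely, the only change needed on the original builder chain is that one extra call (a minimal sketch):

let runtime = Builder::new_multi_thread()
    .worker_threads(2)
    .enable_io() // or .enable_all(); this is what starts the driver that signal::ctrl_c() relies on
    .build()
    .unwrap();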
Updated example (one of the threads still panics due to a Receiver handle being dropped, but the ctrl_c await no longer panics, which is the key):
use tokio::runtime::Builder;
use tokio::signal;
use tokio::sync::watch;
use tokio::task::JoinSet;

fn main() {
    Builder::new_multi_thread()
        .worker_threads(2)
        .enable_io()
        .build()
        .unwrap()
        .block_on(async {
            let (ping_tx, mut ping_rx) = watch::channel(0u32);
            let (pong_tx, mut pong_rx) = watch::channel(0u32);
            let mut tasks = JoinSet::new();

            let ping = tasks.spawn(async move {
                let mut val = 0u32;
                ping_tx.send(val).unwrap();
                while val < 10u32 {
                    pong_rx.changed().await.unwrap();
                    val = *pong_rx.borrow();
                    ping_tx.send(val + 1).unwrap();
                    println!("ping! {}", val + 1);
                }
            });

            let pong = tasks.spawn(async move {
                let mut val = 0u32;
                while val < 10u32 {
                    ping_rx.changed().await.unwrap();
                    val = *ping_rx.borrow();
                    pong_tx.send(val + 1).unwrap();
                    println!("pong! {}", val + 1);
                }
            });

            let mut interrupt = Box::pin(signal::ctrl_c());
            let mut interrupt_read = false;
            while !interrupt_read && !tasks.is_empty() {
                tokio::select! {
                    biased;
                    _ = &mut interrupt, if !interrupt_read => {
                        ping.abort();
                        pong.abort();
                        interrupt_read = true;
                    },
                    _ = tasks.join_next() => {}
                }
            }
        });
}
Rust Playground

How do I avoid obfuscating logic in a `loop`?

Trying to respect Rust safety rules leads me to write code that is, in this case, less clear than the alternative.
It's marginal, but must be a very common pattern, so I wonder if there's any better way.
The following example doesn't compile:
async fn query_all_items() -> Vec<u32> {
    let mut items = vec![];
    let limit = 10;
    loop {
        let response = getResponse().await;
        // response is moved here
        items.extend(response);
        // can't do this, response is moved above
        if response.len() < limit {
            break;
        }
    }
    items
}
In order to satisfy Rust safety rules, we can pre-compute the break condition:
async fn query_all_items() -> Vec<u32> {
    let mut items = vec![];
    let limit = 10;
    loop {
        let response = getResponse().await;
        let should_break = response.len() < limit;
        // response is moved here
        items.extend(response);
        // meh
        if should_break {
            break;
        }
    }
    items
}
Is there any other way?
I agree with Daniel's point that this should be a while rather than a loop, though I'd move the logic to the while rather than creating a boolean:
let mut len = limit;
while len >= limit {
    let response = queryItems(limit).await?;
    len = response.len();
    items.extend(response);
}
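For reference, a complete, runnable sketch of that shape; the get_page stub below stands in for the question's getResponse/queryItems and is purely an assumption for illustration:

// Stub standing in for the real query: the first two "pages" are full,
// the third is short, so the loop stops after three calls.
async fn get_page(call: u32, limit: usize) -> Vec<u32> {
    let len = if call < 3 { limit } else { limit / 2 };
    vec![call; len]
}

async fn query_all_items() -> Vec<u32> {
    let mut items = vec![];
    let limit = 10;
    let mut call = 0;
    // The while condition carries the "was the last page full?" check,
    // so no flag or pre-computed break is needed inside the body.
    let mut len = limit;
    while len >= limit {
        call += 1;
        let response = get_page(call, limit).await;
        len = response.len();
        items.extend(response);
    }
    items
}

#[tokio::main]
async fn main() {
    let items = query_all_items().await;
    println!("fetched {} items", items.len()); // fetched 25 items
}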
Not that you should do this, but an async stream version is possible. However, a plain old loop is much easier to read.
use futures::{future, stream, StreamExt}; // 0.3.19
use rand::{
    distributions::{Distribution, Uniform},
    rngs::ThreadRng,
};
use std::sync::{Arc, Mutex};
use tokio; // 1.15.0

async fn get_response(rng: Arc<Mutex<ThreadRng>>) -> Vec<u32> {
    let mut rng = rng.lock().unwrap();
    let range = Uniform::from(0..100);
    let len_u32 = range.sample(&mut *rng);
    let len_usize = usize::try_from(len_u32).unwrap();
    vec![len_u32; len_usize]
}

async fn query_all_items() -> Vec<u32> {
    let rng = Arc::new(Mutex::new(ThreadRng::default()));
    stream::iter(0..)
        .then(|_| async { get_response(Arc::clone(&rng)).await })
        .take_while(|v| future::ready(v.len() >= 10))
        .collect::<Vec<_>>()
        .await
        .into_iter()
        .flatten()
        .collect()
}

#[tokio::main]
async fn main() {
    // [46, 46, 46, ..., 78, 78, 78], or whatever random list you get
    println!("{:?}", query_all_items().await);
}
I would do this in a while loop since the while will surface the flag more easily.
async fn query_all_items() -> Vec<Item> {
    let mut items = vec![];
    let limit = 10;
    // Start at true so the first query always runs; keep looping while the
    // previous response was a full page.
    let mut limit_reached = true;
    while limit_reached {
        let response = queryItems(limit).await;
        limit_reached = response.len() >= limit;
        items.extend(response);
    }
    items
}
Without context it's hard to advise ideal code. I would do:
fn my_body_is_ready() -> Vec<u32> {
    let mut acc = vec![];
    let min = 10;
    loop {
        let foo = vec![42];
        if foo.len() < min {
            acc.extend(foo);
            break acc;
        } else {
            acc.extend(foo);
        }
    }
}

Sending messages between independent threads in Rust

I understand this simple example (taken from the Rust book):
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        let val = String::from("hi");
        tx.send(val).unwrap();
    });

    let received = rx.recv().unwrap();
    println!("Got: {}", received);
}
The channel's sender is moved to the newly spawned thread and this new thread can then use the sender to send a message via the channel. Then the original thread can receive from the channel and get the message.
Edited after feedback from @Stargateur:
But what about a more complicated example where there are many threads? How does each thread get hold of the senders it needs in order to message all of the other threads? Are all of the channels meant to be created for all of the threads during program startup in main()? So if (for example) there are 50 threads in the program, are 50 channels meant to be created in main(), with all 50 threads each holding a clone of the 50 senders?
For example, if there were only 4 threads, is it meant to be done like this?
use std::sync::mpsc;
use std::thread;

fn main() {
    let (sender1, receiver1) = mpsc::channel();
    let (sender2, receiver2) = mpsc::channel();
    let (sender3, receiver3) = mpsc::channel();
    let (sender4, receiver4) = mpsc::channel();

    let builder1 = thread::Builder::new().name("thread1".into());
    builder1.spawn(move || {
        loop {
            // Am I meant to make a copy of every sender here so I can message whichever thread I need to?
            let sender1 = sender1.clone();
            let sender2 = sender2.clone();
            let sender3 = sender3.clone();
            let sender4 = sender4.clone();
            let val = String::from("hi");
            sender2.send(val).unwrap();
            if let Ok(received) = receiver1.try_recv() {
                println!("Got: {}", received);
            }
        }
    });

    let builder2 = thread::Builder::new().name("thread2".into());
    builder2.spawn(move || {
        loop {
            // Am I meant to make a copy of every sender here so I can message whichever thread I need to?
            let sender1 = sender1.clone();
            let sender2 = sender2.clone();
            let sender3 = sender3.clone();
            let sender4 = sender4.clone();
            let val = String::from("hi");
            sender1.send(val).unwrap();
            if let Ok(received) = receiver2.try_recv() {
                println!("Got: {}", received);
            }
        }
    });

    let builder3 = thread::Builder::new().name("thread3".into());
    builder3.spawn(move || {
        loop {
            // Am I meant to make a copy of every sender here so I can message whichever thread I need to?
            let sender1 = sender1.clone();
            let sender2 = sender2.clone();
            let sender3 = sender3.clone();
            let sender4 = sender4.clone();
            let val = String::from("hi");
            sender2.send(val).unwrap();
            if let Ok(received) = receiver3.try_recv() {
                println!("Got: {}", received);
            }
        }
    });

    let builder4 = thread::Builder::new().name("thread4".into());
    builder4.spawn(move || {
        loop {
            // Am I meant to make a copy of every sender here so I can message whichever thread I need to?
            let sender1 = sender1.clone();
            let sender2 = sender2.clone();
            let sender3 = sender3.clone();
            let sender4 = sender4.clone();
            let val = String::from("hi");
            sender2.send(val).unwrap();
            if let Ok(received) = receiver4.try_recv() {
                println!("Got: {}", received);
            }
        }
    });

    loop {}
}
So far to achieve something like this I've been putting the sender of the second thread into a lazy static. Then the first thread can get the sender out of the lazy static and send the messages to the second thread.
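A minimal sketch of that lazy-static approach (this uses the once_cell crate and made-up names purely for illustration; lazy_static works the same way). The global holds the second thread's Sender behind a Mutex so any thread can look it up instead of having every sender wired through main():

use once_cell::sync::Lazy;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;
use std::thread;

// Global slot for thread 2's sender; None until main() fills it in.
static THREAD2_TX: Lazy<Mutex<Option<Sender<String>>>> = Lazy::new(|| Mutex::new(None));

fn main() {
    let (tx2, rx2): (Sender<String>, Receiver<String>) = channel();
    *THREAD2_TX.lock().unwrap() = Some(tx2);

    let t2 = thread::spawn(move || {
        // Thread 2 owns only its receiver.
        for msg in rx2 {
            println!("thread2 got: {}", msg);
        }
    });

    let t1 = thread::spawn(|| {
        // Thread 1 looks the sender up in the global instead of owning a clone.
        if let Some(tx) = THREAD2_TX.lock().unwrap().as_ref() {
            tx.send(String::from("hi from thread1")).unwrap();
        }
    });

    t1.join().unwrap();
    // Drop the global sender so thread 2's receive loop ends.
    *THREAD2_TX.lock().unwrap() = None;
    t2.join().unwrap();
}

The usual trade-off applies: a global registry like this is convenient, but passing clones of the relevant senders into each thread at spawn time keeps the wiring explicit.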

How can I use threads to run this code simultaneously in rust?

I have a Rust program that creates temporary email addresses using the mail.tm API, and I want to use threads to create the emails simultaneously to increase the speed. However, what I have tried only results in printing "Getting email..." x times and then exiting. I am unsure what to do about this. Any help or suggestions are appreciated.
use json;
use rand::distributions::Alphanumeric;
use rand::{thread_rng, Rng};
use reqwest;
use reqwest::header::{HeaderMap, HeaderValue, ACCEPT, CONTENT_TYPE};
use std::{collections::HashMap, io, iter, vec::Vec};
use std::thread;

fn gen_address() -> Vec<String> {
    let mut rng = thread_rng();
    let address: String = iter::repeat(())
        .map(|()| rng.sample(Alphanumeric))
        .map(char::from)
        .take(10)
        .collect();
    let password: String = iter::repeat(())
        .map(|()| rng.sample(Alphanumeric))
        .map(char::from)
        .take(5)
        .collect();
    let body = reqwest::blocking::get("https://api.mail.tm/domains")
        .unwrap()
        .text()
        .unwrap();
    let domains = json::parse(&body).expect("Failed to parse domain json.");
    let domain = domains["hydra:member"][0]["domain"].to_string();
    let email = format!("{}@{}", &address, &domain);
    vec![email, password]
}

fn gen_email() -> Vec<String> {
    let client = reqwest::blocking::Client::new();
    let address_info = gen_address();
    let address = &address_info[0];
    let password = &address_info[1];
    let mut data = HashMap::new();
    data.insert("address", &address);
    data.insert("password", &password);
    let mut headers = HeaderMap::new();
    headers.insert(ACCEPT, HeaderValue::from_static("application/ld+json"));
    headers.insert(
        CONTENT_TYPE,
        HeaderValue::from_static("application/ld+json"),
    );
    let res = client
        .post("https://api.mail.tm/accounts")
        .headers(headers)
        .json(&data)
        .send()
        .unwrap();
    vec![
        res.status().to_string(),
        address.to_string(),
        password.to_string(),
    ]
}

fn main() {
    fn get_amount() -> i32 {
        let mut amount = String::new();
        loop {
            println!("How many emails do you want?");
            io::stdin()
                .read_line(&mut amount)
                .expect("Failed to read line.");
            let _amount: i32 = match amount.trim().parse() {
                Ok(num) => return num,
                Err(_) => {
                    println!("Please enter a number.");
                    continue;
                }
            };
        }
    }

    let amount = get_amount();
    let handle = thread::spawn(move || {
        for _gen in 0..amount {
            let handle = thread::spawn(|| {
                println!("Getting email...");
                let maildata = gen_email();
                println!(
                    "Status: {}, Address: {}, Password: {}",
                    maildata[0], maildata[1], maildata[2]
                );
            });
        }
    });
    handle.join().unwrap();
}
Rust Playground example
I see a number of sub-threads being spawned from an outer thread. I think you might want to keep those handles and join them. Unless you join those sub-threads, the outer thread will exit early. I set up a Rust Playground to demonstrate ^^.
In the playground example, first run the code as-is and note the output of the code - the function it's running is not_joining_subthreads(). Note that it terminates rather abruptly. Then modify the code to call joining_subthreads(). You should then see the subthreads printing out their stdout messages.
let handle = thread::spawn(move || {
    let mut handles = vec![];
    for _gen in 0..amount {
        let handle = thread::spawn(|| {
            println!("Getting email...");
            let maildata = gen_email();
            println!(
                "Status: {}, Address: {}, Password: {}",
                maildata[0], maildata[1], maildata[2]
            );
        });
        handles.push(handle);
    }
    handles.into_iter().for_each(|h| h.join().unwrap());
});
handle.join().unwrap();
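As an aside, the outer wrapper thread isn't strictly necessary; an equivalent sketch spawns the workers straight from main and joins them there (reusing the question's gen_email and amount):

let mut handles = vec![];
for _gen in 0..amount {
    // Each worker fetches one email; keep its handle so we can wait for it.
    handles.push(thread::spawn(|| {
        println!("Getting email...");
        let maildata = gen_email();
        println!(
            "Status: {}, Address: {}, Password: {}",
            maildata[0], maildata[1], maildata[2]
        );
    }));
}
// Joining here keeps main alive until every worker has finished.
handles.into_iter().for_each(|h| h.join().unwrap());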

Finding a way to solve "...does not live long enough"

I'm building a multiplex in Rust. It's one of my first applications and a great learning experience!
However, I'm facing a problem and I cannot figure out how to solve it in Rust:
Whenever a new channel is added to the multiplex, I have to listen for data on this channel.
The new channel is allocated on the stack when it is requested by the open() function.
However, this channel must not be allocated on the stack but on the heap somehow, because it should stay alive and should not be freed in the next iteration of my receiving loop.
Right now my code looks like this (v0.10-pre):
extern crate collections;
extern crate sync;

use std::comm::{Chan, Port, Select};
use std::mem::size_of_val;
use std::io::ChanWriter;
use std::io::{ChanWriter, PortReader};
use collections::hashmap::HashMap;
use sync::{rendezvous, SyncPort, SyncChan};
use std::task::try;
use std::rc::Rc;

struct MultiplexStream {
    internal_port: Port<(u32, Option<(Port<~[u8]>, Chan<~[u8]>)>)>,
    internal_chan: Chan<u32>
}

impl MultiplexStream {
    fn new(downstream: (Port<~[u8]>, Chan<~[u8]>)) -> ~MultiplexStream {
        let (downstream_port, downstream_chan) = downstream;
        let (p1, c1): (Port<u32>, Chan<u32>) = Chan::new();
        let (p2, c2):
            (Port<(u32, Option<(Port<~[u8]>, Chan<~[u8]>)>)>,
             Chan<(u32, Option<(Port<~[u8]>, Chan<~[u8]>)>)>) = Chan::new();
        let mux = ~MultiplexStream {
            internal_port: p2,
            internal_chan: c1
        };

        spawn(proc() {
            let mut pool = Select::new();
            let mut by_port_num = HashMap::new();
            let mut by_handle_id = HashMap::new();
            let mut handle_id2port_num = HashMap::new();
            let mut internal_handle = pool.handle(&p1);
            let mut downstream_handle = pool.handle(&downstream_port);
            unsafe {
                internal_handle.add();
                downstream_handle.add();
            }
            loop {
                let handle_id = pool.wait();
                if handle_id == internal_handle.id() {
                    // setup new port
                    let port_num: u32 = p1.recv();
                    if by_port_num.contains_key(&port_num) {
                        c2.send((port_num, None))
                    }
                    else {
                        let (p1_,c1_): (Port<~[u8]>, Chan<~[u8]>) = Chan::new();
                        let (p2_,c2_): (Port<~[u8]>, Chan<~[u8]>) = Chan::new();
                        /********************************/
                        let mut h = pool.handle(&p1_); // <--
                        /********************************/
                        /*     the error is HERE ^^^    */
                        /********************************/
                        unsafe { h.add() };
                        by_port_num.insert(port_num, c2_);
                        handle_id2port_num.insert(h.id(), port_num);
                        by_handle_id.insert(h.id(), h);
                        c2.send((port_num, Some((p2_,c1_))));
                    }
                }
                else if handle_id == downstream_handle.id() {
                    // demultiplex
                    let res = try(proc() {
                        let mut reader = PortReader::new(downstream_port);
                        let port_num = reader.read_le_u32().unwrap();
                        let data = reader.read_to_end().unwrap();
                        return (port_num, data);
                    });
                    if res.is_ok() {
                        let (port_num, data) = res.unwrap();
                        by_port_num.get(&port_num).send(data);
                    }
                    else {
                        // TODO: handle error
                    }
                }
                else {
                    // multiplex
                    let h = by_handle_id.get_mut(&handle_id);
                    let port_num = handle_id2port_num.get(&handle_id);
                    let port_num = *port_num;
                    let data = h.recv();
                    try(proc() {
                        let mut writer = ChanWriter::new(downstream_chan);
                        writer.write_le_u32(port_num);
                        writer.write(data);
                        writer.flush();
                    });
                    // todo check if chan was closed
                }
            }
        });

        return mux;
    }

    fn open(self, port_num: u32) -> Result<(Port<~[u8]>, Chan<~[u8]>), ()> {
        let res = try(proc() {
            self.internal_chan.send(port_num);
            let (n, res) = self.internal_port.recv();
            assert!(n == port_num);
            return res;
        });
        if res.is_err() {
            return Err(());
        }
        let res = res.unwrap();
        if res.is_none() {
            return Err(());
        }
        let (p,c) = res.unwrap();
        return Ok((p,c));
    }
}
And the compiler raises this error:
multiplex_stream.rs:81:31: 81:35 error: `p1_` does not live long enough
multiplex_stream.rs:81 let mut h = pool.handle(&p1_);
^~~~
multiplex_stream.rs:48:16: 122:4 note: reference must be valid for the block at 48:15...
multiplex_stream.rs:48 spawn(proc() {
multiplex_stream.rs:49 let mut pool = Select::new();
multiplex_stream.rs:50 let mut by_port_num = HashMap::new();
multiplex_stream.rs:51 let mut by_handle_id = HashMap::new();
multiplex_stream.rs:52 let mut handle_id2port_num = HashMap::new();
multiplex_stream.rs:53
...
multiplex_stream.rs:77:11: 87:7 note: ...but borrowed value is only valid for the block at 77:10
multiplex_stream.rs:77 else {
multiplex_stream.rs:78 let (p1_,c1_): (Port<~[u8]>, Chan<~[u8]>) = Chan::new();
multiplex_stream.rs:79 let (p2_,c2_): (Port<~[u8]>, Chan<~[u8]>) = Chan::new();
multiplex_stream.rs:80
multiplex_stream.rs:81 let mut h = pool.handle(&p1_);
multiplex_stream.rs:82 unsafe { h.add() };
Does anyone have an idea how to solve this issue?
The problem is that the new channel that you create does not live long enough—its scope is that of the else block only. You need to ensure that it will live longer—its scope must be at least that of pool.
I haven't made the effort to understand precisely what your code is doing, but what I would expect to be the simplest way to ensure the lifetime of the ports is long enough is to place them into a vector at the same scope as pool, e.g. let ports = ~[];, inserting each one with ports.push(p1_); and then taking the reference as &ports[ports.len() - 1]. Sorry, that won't cut it: you can't add new items to a vector while references to its elements are active. You'll need to restructure things somewhat if you want that approach to work.
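The old Port/Select API is long gone, but the borrow rule itself hasn't changed. A self-contained modern-Rust illustration of the same "does not live long enough" situation, and of the fix of moving ownership into a longer-lived container, might look like this (the names here are made up for the example):

fn main() {
    // A reference to a value created inside a block cannot be stored in a
    // container that outlives the block:
    let mut refs: Vec<&String> = Vec::new();
    {
        let s = String::from("created inside a block");
        // refs.push(&s); // error[E0597]: `s` does not live long enough
        let _ = &s;
    }

    // Moving ownership into a container declared at the outer scope works,
    // because the value now lives as long as the container does.
    let mut owned: Vec<String> = Vec::new();
    {
        let s = String::from("created inside a block");
        owned.push(s);
    }

    println!("{} refs, {} owned values", refs.len(), owned.len());
}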
