How to create a ring communication between threads using mpsc channels?

I want to spawn n threads that can communicate with each other in a ring topology, e.g. thread 0 can send messages to thread 1, thread 1 to thread 2, etc., and thread n-1 to thread 0.
This is an example of what I want to achieve with n=3:
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;
let (tx0, rx0): (Sender<i32>, Receiver<i32>) = mpsc::channel();
let (tx1, rx1): (Sender<i32>, Receiver<i32>) = mpsc::channel();
let (tx2, rx2): (Sender<i32>, Receiver<i32>) = mpsc::channel();
let child0 = thread::spawn(move || {
tx0.send(0).unwrap();
println!("thread 0 sent: 0");
println!("thread 0 recv: {:?}", rx2.recv().unwrap());
});
let child1 = thread::spawn(move || {
tx1.send(1).unwrap();
println!("thread 1 sent: 1");
println!("thread 1 recv: {:?}", rx0.recv().unwrap());
});
let child2 = thread::spawn(move || {
tx2.send(2).unwrap();
println!("thread 2 sent: 2");
println!("thread 2 recv: {:?}", rx1.recv().unwrap());
});
child0.join();
child1.join();
child2.join();
Here I create channels in a loop, store them in a vector, reorder the senders, store them in a new vector and then spawn threads each with their own Sender-Receiver (tx1/rx0, tx2/rx1, etc.) pair.
const NTHREADS: usize = 8;
// create n channels
let channels: Vec<(Sender<i32>, Receiver<i32>)> =
(0..NTHREADS).into_iter().map(|_| mpsc::channel()).collect();
// swap the tuple entries for the senders to create the ring topology
let mut channels_ring: Vec<(Sender<i32>, Receiver<i32>)> = (0..NTHREADS)
.into_iter()
.map(|i| {
(
channels[if i < channels.len() - 1 { i + 1 } else { 0 }].0,
channels[i].1,
)
})
.collect();
let mut children = Vec::new();
for i in 0..NTHREADS {
let (tx, rx) = channels_ring.remove(i);
let child = thread::spawn(move || {
tx.send(i).unwrap();
println!("thread {} sent: {}", i, i);
println!("thread {} recv: {:?}", i, rx.recv().unwrap());
});
children.push(child);
}
for child in children {
let _ = child.join();
}
This doesn't work, because Sender cannot be copied to create a new vector.
However, if I use refs (& Sender):
let mut channels_ring: Vec<(&Sender<i32>, Receiver<i32>)> = (0..NTHREADS)
.into_iter()
.map(|i| {
(
&channels[if i < channels.len() - 1 { i + 1 } else { 0 }].0,
channels[i].1,
)
})
.collect();
I cannot spawn the threads, because std::sync::mpsc::Sender<i32> cannot be shared between threads safely.

Senders and Receivers cannot be shared, so you need to move them into their respective threads. That means removing them from the Vec, or else consuming the Vec while iterating it; the vector is not permitted to be in an invalid state (with holes), even as an intermediate step. Iterating over the vectors with into_iter achieves that by consuming them.
A little trick you can use to get the senders and receivers to pair up in a cycle is to create two vectors, one of senders and one of receivers, and then rotate one so that the same index into each vector gives you the pairs you want.
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;
fn main() {
const NTHREADS: usize = 8;
// create n channels
let (mut senders, receivers): (Vec<Sender<i32>>, Vec<Receiver<i32>>) =
(0..NTHREADS).into_iter().map(|_| mpsc::channel()).unzip();
// move the first sender to the back
senders.rotate_left(1);
let children: Vec<_> = senders
.into_iter()
.zip(receivers.into_iter())
.enumerate()
.map(|(i, (tx, rx))| {
thread::spawn(move || {
tx.send(i as i32).unwrap();
println!("thread {} sent: {}", i, i);
println!("thread {} recv: {:?}", i, rx.recv().unwrap());
})
})
.collect();
for child in children {
let _ = child.join();
}
}
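To see which channel ends each thread receives after the rotation, here is a minimal sketch that applies the same rotate_left and zip steps to plain channel indices instead of real Senders and Receivers. The function name ring_pairs is made up for this illustration and is not part of the answer's code.

```rust
/// For each thread index, returns (index of the channel it sends on,
/// index of the channel it receives on).
/// `ring_pairs` is a hypothetical helper used only for illustration.
fn ring_pairs(n: usize) -> Vec<(usize, usize)> {
    let mut sender_ids: Vec<usize> = (0..n).collect();
    let receiver_ids: Vec<usize> = (0..n).collect();
    // rotate_left panics if the shift exceeds the length, so skip it for an empty ring.
    if n > 0 {
        // Move the first sender to the back, as in the answer above.
        sender_ids.rotate_left(1);
    }
    sender_ids.into_iter().zip(receiver_ids).collect()
}

fn main() {
    for (i, (tx, rx)) in ring_pairs(3).into_iter().enumerate() {
        println!("thread {i}: sends on channel {tx}, receives on channel {rx}");
    }
}
```

Thread i ends up sending on channel i + 1 (wrapping around to 0) and receiving on channel i, so each message travels from thread i to thread i + 1, which matches the n=3 example in the question.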

Quoting the question: "This doesn't work, because Sender cannot be copied to create a new vector. However, if I use refs (& Sender):"
While it's true that Sender cannot be copied, it does implement Clone, so you can always clone it manually. But that approach won't work for Receiver, which is not Clone and which you also need to extract from the vector.
The problem with your first code is that you cannot use let foo = vec[i] to move just one value out of a vector of non-Copy values. That would leave the vector in an invalid state, with one element invalid, subsequent access to which would cause undefined behavior. For this to work, Vec would need to track which elements were moved and which not, which would impose a cost on all Vecs. So instead, Vec disallows moving an element out of it, leaving it to the user to track moves.
A simple way to move a value out of Vec is to replace Vec<T> with Vec<Option<T>> and use Option::take. Instead of foo = vec[i] you write foo = vec[i].take().unwrap(), which moves the T value out of the option in vec[i] (while asserting that it's not None) and leaves None, a valid variant of Option<T>, in the vector. Here is your first attempt modified in that manner:
const NTHREADS: usize = 8;
let channels_ring: Vec<_> = {
let mut channels: Vec<_> = (0..NTHREADS)
.into_iter()
.map(|_| {
let (tx, rx) = mpsc::channel();
(Some(tx), Some(rx))
})
.collect();
(0..NTHREADS)
.into_iter()
.map(|rxpos| {
let txpos = if rxpos < NTHREADS - 1 { rxpos + 1 } else { 0 };
(
channels[txpos].0.take().unwrap(),
channels[rxpos].1.take().unwrap(),
)
})
.collect()
};
let children: Vec<_> = channels_ring
.into_iter()
.enumerate()
.map(|(i, (tx, rx))| {
thread::spawn(move || {
tx.send(i as i32).unwrap();
println!("thread {} sent: {}", i, i);
println!("thread {} recv: {:?}", i, rx.recv().unwrap());
})
})
.collect();
for child in children {
child.join().unwrap();
}
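The Option::take step can also be looked at in isolation. The following is a small sketch, with a made-up helper name take_at and Strings standing in for the channel ends, of how a value is moved out of one slot while the vector itself stays valid.

```rust
/// Moves the value out of slot `i`, leaving `None` behind.
/// `take_at` is a hypothetical helper, not a standard library function.
fn take_at<T>(slots: &mut [Option<T>], i: usize) -> T {
    // take() swaps None into the slot and returns what was there before.
    slots[i].take().expect("slot was already emptied")
}

fn main() {
    // Strings stand in for non-Copy values such as Sender or Receiver.
    let mut slots = vec![Some(String::from("tx")), Some(String::from("rx"))];
    let moved = take_at(&mut slots, 1);
    println!("moved out: {moved}, remaining: {slots:?}");
}
```

Calling take_at twice on the same index panics with the expect message, which is the run-time counterpart of the move tracking that Vec does not do for you.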

Related

How to pass a value over a channel without borrow checking issues?

In the following code, I understand why I'm not allowed to do this (I think), but I'm not sure how to fix the issue. I'm simply trying to perform an action based upon an incoming message on a UdpSocket. However, by sending the reference to the slice over the channel, I get a problem where the buffer doesn't live long enough. I'm hoping for some suggestions because I don't know enough about Rust to move forward.
fn main() -> std::io::Result<()> {
let (tx, rx) = mpsc::channel();
thread::spawn(move || loop {
match rx.try_recv() {
Ok(msg) => {
match msg {
"begin" => { /* run an operation */ }
"end" | _ => { /* kill the previous operation */ }
}
}
Err(_) => { /* error handling */ }
}
});
// start listener
let socket: UdpSocket = UdpSocket::bind("0.0.0.0:9001")?;
loop {
let mut buffer = [0; 100];
let (length, src_address) = socket.recv_from(&mut buffer)?;
println!("Received message of {} bytes from {}", length, src_address);
let cmd= str::from_utf8(&buffer[0..length]).unwrap(); // <- buffer does not live long enough
println!("Command: {}", cmd);
tx.send(cmd).expect("unable to send message to channel"); // Error goes away if I remove this.
}
}
Generally you should avoid sending non-owned values over a channel, since it's unlikely that a lifetime would be valid for both the sender and the receiver (it's possible to do, but you'd have to plan for it).
In this situation, you're trying to pass a &str across the channel, but since it just references buffer, which isn't guaranteed to exist whenever rx receives it, you get a borrow checking error. You would probably want to convert the &str into an owned String and pass that over the channel:
use std::net::UdpSocket;
use std::sync::mpsc;
fn main() {
let (tx, rx) = mpsc::channel();
std::thread::spawn(move || loop {
match rx.recv().as_deref() {
Ok("begin") => { /* run an operation */ }
Ok("end") => { /* kill the previous operation */ }
Ok(_) => { /* unknown */ }
Err(_) => { break; }
}
});
let socket = UdpSocket::bind("0.0.0.0:9001").unwrap();
loop {
let mut buffer = [0; 100];
let (length, src_address) = socket.recv_from(&mut buffer).unwrap();
let cmd = std::str::from_utf8(&buffer[0..length]).unwrap();
tx.send(cmd.to_owned()).unwrap();
}
}
As proposed in the comments, you can avoid allocating a string if you parse the value into a known value for an enum and send that across the channel instead:
use std::net::UdpSocket;
use std::sync::mpsc;
enum Command {
Begin,
End,
}
fn main() {
let (tx, rx) = mpsc::channel();
std::thread::spawn(move || loop {
match rx.recv() {
Ok(Command::Begin) => { /* run an operation */ }
Ok(Command::End) => { /* kill the previous operation */ }
Err(_) => { break; }
}
});
let socket = UdpSocket::bind("0.0.0.0:9001").unwrap();
loop {
let mut buffer = [0; 100];
let (length, src_address) = socket.recv_from(&mut buffer).unwrap();
let cmd = std::str::from_utf8(&buffer[0..length]).unwrap();
let cmd = match cmd {
"begin" => Command::Begin,
"end" => Command::End,
_ => panic!("unknown command")
};
tx.send(cmd).unwrap();
}
}
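If panicking on an unknown command is too harsh for a network listener, one possible variation is to implement FromStr for the enum so that parsing returns a Result. This is only a sketch: the String error type and the trim() call (on the assumption that datagrams may carry a trailing newline) are choices made here, not something the answer prescribes.

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
enum Command {
    Begin,
    End,
}

impl FromStr for Command {
    // Assumption: a plain String is enough to describe a parse failure.
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Assumption: surrounding whitespace (e.g. a trailing newline) is not significant.
        match s.trim() {
            "begin" => Ok(Command::Begin),
            "end" => Ok(Command::End),
            other => Err(format!("unknown command: {other:?}")),
        }
    }
}

fn main() {
    for raw in ["begin", "end\n", "restart"] {
        match raw.parse::<Command>() {
            // In the listener loop this is where tx.send(cmd) would go.
            Ok(cmd) => println!("parsed {cmd:?}"),
            Err(e) => eprintln!("{e}"),
        }
    }
}
```

The listener can then log and skip bad input instead of taking the whole process down, while the worker thread still only ever sees valid Command values.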

How can I use threads to run this code simultaneously in Rust?

I have a rust program that creates temporary email addresses using the mail.tm API, and I want to use threads to create emails simultaneously, to increase the speed. However, what I have tried, only results in printing "Getting email.." x amount of times, and exiting. I am unsure what to do about this. Any help or suggestions are appreciated.
use json;
use rand::distributions::Alphanumeric;
use rand::{thread_rng, Rng};
use reqwest;
use reqwest::header::{HeaderMap, HeaderValue, ACCEPT, CONTENT_TYPE};
use std::{collections::HashMap, io, iter, vec::Vec};
use std::thread;
fn gen_address() -> Vec<String> {
let mut rng = thread_rng();
let address: String = iter::repeat(())
.map(|()| rng.sample(Alphanumeric))
.map(char::from)
.take(10)
.collect();
let password: String = iter::repeat(())
.map(|()| rng.sample(Alphanumeric))
.map(char::from)
.take(5)
.collect();
let body = reqwest::blocking::get("https://api.mail.tm/domains")
.unwrap()
.text()
.unwrap();
let domains = json::parse(&body).expect("Failed to parse domain json.");
let domain = domains["hydra:member"][0]["domain"].to_string();
let email = format!("{}@{}", &address, &domain);
vec![email, password]
}
fn gen_email() -> Vec<String> {
let client = reqwest::blocking::Client::new();
let address_info = gen_address();
let address = &address_info[0];
let password = &address_info[1];
let mut data = HashMap::new();
data.insert("address", &address);
data.insert("password", &password);
let mut headers = HeaderMap::new();
headers.insert(ACCEPT, HeaderValue::from_static("application/ld+json"));
headers.insert(
CONTENT_TYPE,
HeaderValue::from_static("application/ld+json"),
);
let res = client
.post("https://api.mail.tm/accounts")
.headers(headers)
.json(&data)
.send()
.unwrap();
vec![
res.status().to_string(),
address.to_string(),
password.to_string(),
]
}
fn main() {
fn get_amount() -> i32 {
let mut amount = String::new();
loop {
println!("How many emails do you want?");
io::stdin()
.read_line(&mut amount)
.expect("Failed to read line.");
let _amount: i32 = match amount.trim().parse() {
Ok(num) => return num,
Err(_) => {
println!("Please enter a number.");
continue;
}
};
}
}
let amount = get_amount();
let handle = thread::spawn(move || {
for _gen in 0..amount {
let handle = thread::spawn(|| {
println!("Getting email...");
let maildata = gen_email();
println!(
"Status: {}, Address: {}, Password: {}",
maildata[0], maildata[1], maildata[2]);
});
}
});
handle.join().unwrap();
}
Rust Playground example
I see a number of sub-threads being spawned from an outer thread. I think you want to keep those handles and join them: unless you join the sub-threads, the outer thread returns as soon as it has spawned them, main then returns too, and the process exits before the sub-threads finish. I set up a Rust Playground to demonstrate.
In the playground example, first run the code as-is and note the output of the code - the function it's running is not_joining_subthreads(). Note that it terminates rather abruptly. Then modify the code to call joining_subthreads(). You should then see the subthreads printing out their stdout messages.
let handle = thread::spawn(move || {
let mut handles = vec![];
for _gen in 0..amount {
let handle = thread::spawn(|| {
println!("Getting email...");
let maildata = gen_email();
println!(
"Status: {}, Address: {}, Password: {}",
maildata[0], maildata[1], maildata[2]);
});
handles.push(handle);
}
handles.into_iter().for_each(|h| h.join().unwrap());
});
handle.join().unwrap();
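The same idea can be tried without any network access. In the sketch below, fake_gen_email is a hypothetical stand-in for the question's gen_email (it returns canned data), and create_all is a made-up name; the point is only that every JoinHandle is kept and joined, and that join() also hands back each thread's return value.

```rust
use std::thread;

// Hypothetical stand-in for the question's gen_email(); it performs no network I/O.
fn fake_gen_email(id: usize) -> Vec<String> {
    vec![
        "201 Created".to_string(),
        format!("user{id}@example.com"),
        format!("pw{id}"),
    ]
}

fn create_all(amount: usize) -> Vec<Vec<String>> {
    // Spawn one thread per email and keep every JoinHandle.
    // collect() forces all threads to be spawned before any is joined.
    let handles: Vec<_> = (0..amount)
        .map(|id| thread::spawn(move || fake_gen_email(id)))
        .collect();
    // join() blocks until the thread finishes and returns the closure's result.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    for maildata in create_all(3) {
        println!(
            "Status: {}, Address: {}, Password: {}",
            maildata[0], maildata[1], maildata[2]
        );
    }
}
```

Note that this version has no outer thread at all: main spawns the workers and joins them directly, which is usually enough for this kind of program.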

Channels for passing hashmap between threads | stuck in loop | Rust

I am solving a problem for the website Exercism in Rust, where I basically try to concurrently count how many times different letters occur in some text. I am doing this by passing hashmaps between threads, and somehow I end up in some kind of infinite loop. I think the issue is in my handling of the receiver, but I really don't know. Please help.
use std::collections::HashMap;
use std::thread;
use std::sync::mpsc;
use std::str;
pub fn frequency(input: &[&str], worker_count: usize) -> HashMap<char, usize> {
// Empty case
if input.is_empty() {
return HashMap::new();
}
// Flatten input, set workload for each thread, create hashmap to catch results
let mut flat_input = input.join("");
let workload = input.len() / worker_count;
let mut final_map: HashMap<char, usize> = HashMap::new();
let (tx, rx) = mpsc::channel();
for _i in 0..worker_count {
let task = flat_input.split_off(flat_input.len() - workload);
let tx_clone = mpsc::Sender::clone(&tx);
// Separate threads ---------------------------------------------
thread::spawn(move || {
let mut partial_map: HashMap<char, usize> = HashMap::new();
for letter in task.chars() {
match partial_map.remove(&letter) {
Some(count) => {
partial_map.insert(letter, count + 1);
},
None => {
partial_map.insert(letter, 1);
}
}
}
tx_clone.send(partial_map).expect("Didn't work fool");
});
// --------------------------------------------------
}
// iterate through the returned hashmaps to update the final map
for received in rx {
for (key, value) in received {
match final_map.remove(&key) {
Some(count) => {
final_map.insert(key, count + value);
},
None => {
final_map.insert(key, value);
}
}
}
}
return final_map;
}
Iterating on the receiver rx blocks waiting for new messages for as long as any sender exists. The clones you've moved into the threads are dropped when those threads finish, but the original sender tx is still in scope.
You can force tx out of scope by dropping it manually:
for _i in 0..worker_count {
...
}
std::mem::drop(tx); // <--------
for received in rx {
...
}
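Here is a small self-contained sketch of the fix in context. It is not the Exercism solution: count_letters is a made-up function that takes pre-split chunks, and it uses the HashMap entry API instead of the question's remove/insert pairs. What it does show is the drop(tx) call sitting between the spawning loop and the receiving loop.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: counts characters across chunks, one thread per chunk.
fn count_letters(chunks: Vec<String>) -> HashMap<char, usize> {
    let (tx, rx) = mpsc::channel();
    for chunk in chunks {
        let tx = tx.clone();
        thread::spawn(move || {
            let mut partial: HashMap<char, usize> = HashMap::new();
            for letter in chunk.chars() {
                *partial.entry(letter).or_insert(0) += 1;
            }
            tx.send(partial).expect("receiver hung up");
            // The cloned sender is dropped here, when the thread ends.
        });
    }
    // Drop the original sender; without this the loop below never sees the channel close.
    drop(tx);

    let mut total = HashMap::new();
    for partial in rx {
        for (letter, count) in partial {
            *total.entry(letter).or_insert(0) += count;
        }
    }
    total
}

fn main() {
    let counts = count_letters(vec!["abc".to_string(), "aab".to_string()]);
    println!("{counts:?}");
}
```

Commenting out the drop(tx) line makes the program hang in the for loop over rx, which reproduces the behaviour described in the question.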

Pass a matrix (Vec<Vec<f64>>) readonly to multiple threads

I am new to Rust and I am struggling with the concept of borrowing.
I want to load a Vec<Vec<f64>> matrix and then process it in parallel. However when I try to compile this piece of code I get error: capture of moved value: `matrix` [E0382] at the let _ = line.
This matrix is supposed to be readonly for the threads, they won't modify it. How can I pass it readonly and make the "moved value" error go away?
fn process(matrix: &Vec<Vec<f64>>) {
// do nothing for now
}
fn test() {
let filename = "matrix.tsv";
// loads matrix into a Vec<Vec<f64>>
let mut matrix = load_matrix(filename);
// Determine number of cpus
let ncpus = num_cpus::get();
println!("Number of cpus on this machine: {}", ncpus);
for i in 0..ncpus {
// In the next line the "error: capture of moved value: matrix" happens
let _ = thread::spawn(move || {
println!("Thread number: {}", i);
process(&matrix);
});
}
let d = Duration::from_millis(1000 * 1000);
thread::sleep(d);
}
Wrap the object to be shared in an Arc, which stands for atomically reference-counted pointer. For each thread, clone this pointer and pass ownership of the clone to the thread. The wrapped object is deallocated when the last clone is dropped.
fn process(matrix: &Vec<Vec<f64>>) {
// do nothing for now
}
fn test() {
use std::sync::Arc;
let filename = "matrix.tsv";
// loads matrix into a Vec<Vec<f64>>
let matrix = Arc::new(load_matrix(filename));
// Determine number of cpus
let ncpus = num_cpus::get();
println!("Number of cpus on this machine: {}", ncpus);
for i in 0..ncpus {
let matrix = matrix.clone();
let _ = thread::spawn(move || {
println!("Thread number: {}", i);
process(&matrix);
});
}
let d = Duration::from_millis(1000 * 1000);
thread::sleep(d);
}
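A variation on the same idea that can be run as-is: the sketch below shares a small in-memory matrix through an Arc and joins the threads instead of sleeping. The functions row_sum and parallel_row_sums are invented for this example, and spawning one thread per row is chosen only to keep it short.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical per-thread work: sum one row of the shared, read-only matrix.
fn row_sum(matrix: &[Vec<f64>], row: usize) -> f64 {
    matrix[row].iter().sum()
}

fn parallel_row_sums(matrix: Vec<Vec<f64>>) -> Vec<f64> {
    let matrix = Arc::new(matrix);
    let handles: Vec<_> = (0..matrix.len())
        .map(|row| {
            // Cloning the Arc copies the pointer and bumps the count, not the matrix data.
            let matrix = Arc::clone(&matrix);
            thread::spawn(move || row_sum(&matrix, row))
        })
        .collect();
    // Joining replaces the sleep: we wait exactly as long as the threads need.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    let sums = parallel_row_sums(vec![vec![1.0, 2.0], vec![3.0, 4.5]]);
    println!("{sums:?}");
}
```

Because Arc only hands out shared references to its contents, the threads can read the matrix but cannot modify it, which is the read-only access the question asks for.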

Application on OSX cannot spawn more than 2048 threads

I have a Rust application on OSX that fires up a large number of threads, as can be seen in the code below. After checking how many threads my version of OSX allows via the sysctl kern.num_taskthreads command, I can see that it reports kern.num_taskthreads: 2048, which explains why I can't spin up more than 2048 threads.
How do I go about getting past this hard limit?
let threads = 300000;
let requests = 1;
for _x in 0..threads {
println!("{}", _x);
let request_clone = request.clone();
let handle = thread::spawn(move || {
for _y in 0..requests {
request_clone.lock().unwrap().push((request::Request::new(request::Request::create_request())));
}
});
child_threads.push(handle);
}
Before starting, I'd encourage you to read about the C10K problem. When you get to this scale, there are a lot more things you need to keep in mind.
That being said, I'd suggest looking at mio, which describes itself as "a lightweight IO library for Rust with a focus on adding as little overhead as possible over the OS abstractions."
Specifically, mio provides an event loop, which allows you to handle a large number of connections without spawning threads. Unfortunately, I don't know of an HTTP library that currently supports mio. You could create one and be a hero to the Rust community!
Not sure how helpful this will be, but I was trying to create a small pool of threads that create connections and then send them over a channel to an event loop for reading.
I'm sure this code is probably pretty bad, but here it is anyway as an example. It uses the Hyper library, like you mentioned.
extern crate hyper;
use std::io::Read;
use std::thread;
use std::thread::{JoinHandle};
use std::sync::{Arc, Mutex};
use std::sync::mpsc::channel;
use hyper::Client;
use hyper::client::Response;
use hyper::header::Connection;
const TARGET: i32 = 100;
const THREADS: i32 = 10;
struct ResponseWithString {
index: i32,
response: Response,
data: Vec<u8>,
complete: bool
}
fn main() {
// Create a client.
let url: &'static str = "http://www.gooogle.com/";
let mut threads = Vec::<JoinHandle<()>>::with_capacity((TARGET * 2) as usize);
let conn_count = Arc::new(Mutex::new(0));
let (tx, rx) = channel::<ResponseWithString>();
for _ in 0..THREADS {
// Move var references into thread context
let conn_count = conn_count.clone();
let tx = tx.clone();
let t = thread::spawn(move || {
loop {
let idx: i32;
{
// Lock, increment, and release
let mut count = conn_count.lock().unwrap();
*count += 1;
idx = *count;
}
if idx > TARGET {
break;
}
let mut client = Client::new();
// Creating an outgoing request.
println!("Creating connection {}...", idx);
let res = client.get(url) // Get URL...
.header(Connection::close()) // Set headers...
.send().unwrap(); // Fire!
println!("Pushing response {}...", idx);
tx.send(ResponseWithString {
index: idx,
response: res,
data: Vec::<u8>::with_capacity(1024),
complete: false
}).unwrap();
}
});
threads.push(t);
}
let mut responses = Vec::<ResponseWithString>::with_capacity(TARGET as usize);
let mut buf: [u8; 1024] = [0; 1024];
let mut completed_count = 0;
loop {
if completed_count >= TARGET {
break; // No more work!
}
match rx.try_recv() {
Ok(r) => {
println!("Incoming response! {}", r.index);
responses.push(r)
},
_ => { }
}
for r in &mut responses {
if r.complete {
continue;
}
// Read the Response.
let res = &mut r.response;
let data = &mut r.data;
let idx = &r.index;
match res.read(&mut buf) {
Ok(i) => {
if i == 0 {
println!("No more data! {}", idx);
r.complete = true;
completed_count += 1;
}
else {
println!("Got data! {} => {}", idx, i);
for x in 0..i {
data.push(buf[x]);
}
}
}
Err(e) => {
panic!("Oh no! {} {}", idx, e);
}
}
}
}
}
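The pool structure used above can be reduced to a sketch that needs only the standard library. Everything here is an illustration under stated assumptions: handle_job is a hypothetical stand-in for issuing one request (it does no network I/O), run_pool is a made-up name, and an AtomicUsize replaces the Mutex-protected counter from the code above as the way workers claim the next job.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;

// Hypothetical stand-in for issuing one request; no network I/O happens here.
fn handle_job(job: usize) -> usize {
    job * 2
}

// Runs `jobs` jobs on a fixed number of worker threads and returns results by job index.
fn run_pool(workers: usize, jobs: usize) -> Vec<usize> {
    let next = Arc::new(AtomicUsize::new(0));
    let (tx, rx) = mpsc::channel();
    for _ in 0..workers {
        let next = Arc::clone(&next);
        let tx = tx.clone();
        thread::spawn(move || loop {
            // Claim the next job index; stop once all jobs have been taken.
            let job = next.fetch_add(1, Ordering::Relaxed);
            if job >= jobs {
                break;
            }
            tx.send((job, handle_job(job))).expect("receiver hung up");
        });
    }
    // Only the workers' clones remain, so rx ends once every worker has finished.
    drop(tx);

    let mut results = vec![0; jobs];
    for (job, value) in rx {
        results[job] = value;
    }
    results
}

fn main() {
    // 300 jobs on 8 threads, far below any per-process thread limit.
    let results = run_pool(8, 300);
    println!("finished {} jobs, last result {}", results.len(), results[299]);
}
```

The number of OS threads is fixed by the workers argument regardless of how many jobs there are, which is how a design like this stays under a limit such as kern.num_taskthreads.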
