I'm using Rust to download huge amounts of stock market data, around 50,000 GET requests per cycle. To speed this up significantly, I've been using multithreading. My code so far looks like this:
// Instantiate a channel so threads can send data to the main thread
let (s, r) = channel();
// Vector to store all threads created
let mut threads = Vec::new();
// Iterate through every security in the universe
for security in universe {
    // Clone the sender
    let thread_send = s.clone();
    // Create a thread with a closure that makes 5 GET requests for the current security
    let t = thread::spawn(move || {
        // Download the 5 price vectors and send everything in a tuple to the main thread
        let price_vectors = download_security(&security);
        let tuple = (security, price_vectors.0, price_vectors.1, price_vectors.2, price_vectors.3, price_vectors.4);
        thread_send.send(tuple).unwrap();
    });
    // PAUSE THE MAIN THREAD BECAUSE OF THE ERROR I'M GETTING
    thread::sleep(Duration::from_millis(20));
    // Add the new thread to the threads vector
    threads.push(t);
}
drop(s);
// Join all the threads together so the main thread waits for their completion
for t in threads {
    t.join();
}
The download_security() function that each thread calls simply makes 5 GET requests to download price data (minutely, hourly, daily, weekly, monthly data). I'm using the ureq crate to make these requests. The download_security() function looks like this:
// Call minutely data and let thread sleep for arbitrary amount of time
let minute_text = ureq::get(&minute_link).call().unwrap().into_string().unwrap();
thread::sleep(Duration::from_millis(1000));
// Call hourly data and let thread sleep for arbitrary amount of time
let hour_text = ureq::get(&hour_link).call().unwrap().into_string().unwrap();
thread::sleep(Duration::from_millis(1000));
// Call daily data and let thread sleep for arbitrary amount of time
let day_text = ureq::get(&day_link).call().unwrap().into_string().unwrap();
thread::sleep(Duration::from_millis(1000));
// Call weekly data and let thread sleep for arbitrary amount of time
let week_text = ureq::get(&week_link).call().unwrap().into_string().unwrap();
thread::sleep(Duration::from_millis(1000));
// Call monthly data and let thread sleep for arbitrary amount of time
let month_text = ureq::get(&month_link).call().unwrap().into_string().unwrap();
thread::sleep(Duration::from_millis(1000));
Now, the reason I'm putting my threads to sleep throughout this code is that whenever I make too many HTTP requests too fast, I get this strange error:
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Transport(Transport { kind: Dns, message: None, url: Some(Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("api.polygon.io")), port: None, path: "/v2/aggs/ticker/SPHB/range/1/minute/2021-05-22/2021-10-22", query: Some("adjusted=true&sort=asc&limit=200&apiKey=wo_oZg8qxLYzwo3owc6mQ1EIOp7yCr0g"), fragment: None }), source: Some(Custom { kind: Uncategorized, error: "failed to lookup address information: nodename nor servname provided, or not known" }) })', src/main.rs:243:54
When I increase the amount of time that my main thread sleeps after creating a new subthread, OR the amount of time that my subthreads sleep after making each of the 5 GET requests, the number of these errors goes down. When the sleeps are too short, I'll see this error printed out for 90%+ of the securities I try to download. When the sleeps are longer, everything works perfectly, except that the process takes WAY too long. This is frustrating because I need this process to be as fast as possible, preferably <1 minute for all 10,000 securities.
I'm running macOS Big Sur on an M1 Mac Mini. Is there some kind of fundamental limit on my OS as to how many GET requests I can make per second?
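(For reference, a minimal sketch of one alternative to the sleeps: a fixed pool of worker threads pulls securities from a shared queue, so at most MAX_IN_FLIGHT downloads, and therefore DNS lookups, are in flight at any moment. It reuses download_security and universe from above and assumes the securities are plain Strings; it is illustrative only, not a drop-in replacement.)

use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Cap on simultaneous downloads (and therefore simultaneous DNS lookups).
const MAX_IN_FLIGHT: usize = 50;

fn download_all(universe: Vec<String>) {
    let (job_tx, job_rx) = mpsc::channel::<String>();
    let job_rx = Arc::new(Mutex::new(job_rx));
    let (result_tx, result_rx) = mpsc::channel();

    // Fixed pool of workers; each one downloads securities one at a time.
    let mut workers = Vec::new();
    for _ in 0..MAX_IN_FLIGHT {
        let job_rx = Arc::clone(&job_rx);
        let result_tx = result_tx.clone();
        workers.push(thread::spawn(move || loop {
            // Take the next security off the queue; exit once the queue is closed and drained.
            let security = match job_rx.lock().unwrap().recv() {
                Ok(s) => s,
                Err(_) => break,
            };
            let prices = download_security(&security);
            result_tx.send((security, prices)).unwrap();
        }));
    }
    drop(result_tx); // main keeps no sender, so result_rx ends once the workers finish

    for security in universe {
        job_tx.send(security).unwrap();
    }
    drop(job_tx); // close the queue so workers exit when it drains

    for (_security, _prices) in result_rx {
        // store the downloaded data here
    }
    for w in workers {
        w.join().unwrap();
    }
}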
Any help would be greatly appreciated.
Thank you!
I have a worker thread implemented something like this:
impl WorkerThread {
    fn new() -> Self {
        let (main_sender, thread_receiver) = crossbeam_channel::unbounded();
        let (thread_sender, main_receiver) = crossbeam_channel::unbounded();
        let _ = thread::spawn(move || loop {
            if let Ok(s) = thread_receiver.recv() {
                let result = do_work();
                thread_sender.send(result).unwrap();
            }
        });
        Self {
            main_sender,
            main_receiver,
        }
    }
}
Is this infinite loop considered busy waiting?
The documentation for the crossbeam_channel::Receiver::recv says:
Blocks the current thread until a message is received or the channel is empty and disconnected.
But what does "blocks the current thread" mean here?
On Ubuntu, I looked at the CPU load with htop: with this blocking call, only one core (the main thread) was busy. But if I replace recv() with the non-blocking try_recv(), htop shows my second core's load increasing up to 100%.
Can I assume identical behavior on every platform (Linux, Mac, Windows, etc.), or is this just how my particular system's scheduler works?
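A small self-contained experiment (my own sketch, not from the code above) makes the difference visible: the blocking recv() parks the worker thread while the channel is empty, so it burns essentially no CPU, whereas a bare try_recv() loop spins a core at 100%.

use std::thread;
use std::time::Duration;

fn main() {
    let (sender, receiver) = crossbeam_channel::unbounded::<u32>();

    let worker = thread::spawn(move || {
        // recv() parks this thread until a message arrives or every sender is
        // dropped; while parked it uses essentially no CPU. Replacing it with
        // `loop { receiver.try_recv(); }` would instead spin at 100% on one core.
        while let Ok(n) = receiver.recv() {
            println!("worker got {}", n);
        }
        // recv() returned Err: the channel is empty *and* disconnected.
    });

    for n in 0..3 {
        sender.send(n).unwrap();
        thread::sleep(Duration::from_secs(1));
    }
    drop(sender); // disconnect, so the worker's recv() unblocks with Err and the thread exits
    worker.join().unwrap();
}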
I'm new to Rust.
For learning purposes, I'm writing a simple program to search for files in Linux, and it uses a recursive function:
fn ffinder(base_dir: String, prmtr: &'static str, e: bool, h: bool) -> std::io::Result<()> {
    let mut handle_vec = vec![];
    let pth = std::fs::read_dir(&base_dir)?;
    for p in pth {
        let p2 = p?.path().clone();
        if p2.is_dir() {
            if !h { // search doesn't include hidden directories
                let sstring: String = get_fname(p2.display().to_string());
                let slice: String = sstring[..1].to_string();
                if slice != ".".to_string() {
                    let handle = thread::spawn(move || {
                        ffinder(p2.display().to_string(), prmtr, e, h);
                    });
                    handle_vec.push(handle);
                }
            } else { // search includes hidden directories
                let handle2 = thread::spawn(move || {
                    ffinder(p2.display().to_string(), prmtr, e, h);
                });
                handle_vec.push(handle2);
            }
        } else {
            let handle3 = thread::spawn(move || {
                if compare(rmv_underline(get_fname(p2.display().to_string())), rmv_underline(prmtr.to_string()), e) {
                    println!("File found at: {}", p2.display().to_string().blue());
                }
            });
            handle_vec.push(handle3);
        }
    }
    for h in handle_vec {
        h.join().unwrap();
    }
    Ok(())
}
I've tried to use multithreading (thread::spawn); however, it can create too many threads, blowing past the OS limit and breaking the program's execution.
Is there a way to multithread with recursion, using a safe, limited (fixed) number of threads?
As one of the commenters mentioned, this is an absolutely perfect case for using Rayon. The blog post mentioned doesn't show how Rayon might be used in recursion, only making an allusion to crossbeam's scoped threads with a broken link. However, Rayon provides its own scoped threads implementation that solves your problem as well, in that it only uses as many threads as you have cores available, avoiding the error you ran into.
Here's the documentation for it:
https://docs.rs/rayon/1.0.1/rayon/fn.scope.html
Here's an example from some code I recently wrote. Basically what it does is recursively scan a folder, and each time it nests into a folder it creates a new job to scan that folder while the current thread continues. In my own tests it vastly outperforms a single threaded approach.
let source = PathBuf::from("/foo/bar/");
let (tx, rx) = mpsc::channel();
rayon::scope(|s| scan(&source, tx, s));

fn scan<'a, U: AsRef<Path>>(
    src: &U,
    tx: Sender<(Result<DirEntry, std::io::Error>, u64)>,
    scope: &Scope<'a>,
) {
    let dir = fs::read_dir(src).unwrap();
    dir.into_iter().for_each(|entry| {
        let info = entry.as_ref().unwrap();
        let path = info.path();
        if path.is_dir() {
            let tx = tx.clone();
            scope.spawn(move |s| scan(&path, tx, s)) // Recursive call here
        } else {
            // dbg!("{}", path.as_os_str().to_string_lossy());
            let size = info.metadata().unwrap().len();
            tx.send((entry, size)).unwrap();
        }
    });
}
I'm not an expert on Rayon, but I'm fairly certain the threading strategy works like this:
Rayon creates a pool of threads to match the number of logical cores you have available in your environment. The first call to the scoped function creates a job that the first available thread "steals" from the queue of jobs available. Each time we make another recursive call, it doesn't necessarily execute immediately, but it creates a new job that an idle thread can then "steal" from the queue. If all of the threads are busy, the job queue just fills up each time we make another recursive call, and each time a thread finishes its current job it steals another job from the queue.
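That pool size is easy to check; a one-line sketch of my own (not from the repo below):

fn main() {
    // Rayon's global pool defaults to one thread per logical core
    // (overridable with the RAYON_NUM_THREADS environment variable).
    println!("pool size: {}", rayon::current_num_threads());
}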
The full code can be found here: https://github.com/1Dragoon/fcp
(Note that the repo is a work in progress; the code there is frequently broken and probably won't work at the time you're reading this.)
A caveat for the reader: I'm more of a sysadmin than an actual developer, so I don't know whether this is the ideal approach. From Rayon's documentation linked earlier:
scope() is a more flexible building block compared to join(), since a loop can be used to spawn any number of tasks without recursing
The language of that is a bit confusing. I'm not sure what they mean by "without recursing". Join seems to intend for you to already have tasks known about ahead of time and to execute them in parallel as threads become available, whereas scope seems to be more aimed at only creating jobs when you need them rather than having to know everything you need to do in advance. Either that or I'm understanding their meaning backwards.
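To make that comparison concrete, here is a minimal divide-and-conquer sketch of my own using join() rather than scope(): the recursion still generates work as it goes, but each call forks exactly two jobs, whereas scope() lets a loop spawn any number of them.

// Each call splits the slice in half; Rayon's pool steals whichever half
// the current thread doesn't run itself.
fn parallel_sum(data: &[u64]) -> u64 {
    if data.len() <= 1024 {
        return data.iter().sum();
    }
    let (left, right) = data.split_at(data.len() / 2);
    let (a, b) = rayon::join(|| parallel_sum(left), || parallel_sum(right));
    a + b
}

fn main() {
    let data: Vec<u64> = (0..1_000_000).collect();
    println!("{}", parallel_sum(&data));
}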
Tokio has the same example of a simple TCP echo server on its:
GitHub main page (https://github.com/tokio-rs/tokio)
API reference main page (https://docs.rs/tokio/0.2.18/tokio/)
However, in both pages, there is no explanation of what's actually going on. Here's the example, slightly modified so that the main function does not return Result<(), Box<dyn std::error::Error>>:
use tokio::net::TcpListener;
use tokio::prelude::*;

#[tokio::main]
async fn main() {
    if let Ok(mut tcp_listener) = TcpListener::bind("127.0.0.1:8080").await {
        while let Ok((mut tcp_stream, _socket_addr)) = tcp_listener.accept().await {
            tokio::spawn(async move {
                let mut buf = [0; 1024];
                // In a loop, read data from the socket and write the data back.
                loop {
                    let n = match tcp_stream.read(&mut buf).await {
                        // socket closed
                        Ok(n) if n == 0 => return,
                        Ok(n) => n,
                        Err(e) => {
                            eprintln!("failed to read from socket; err = {:?}", e);
                            return;
                        }
                    };
                    // Write the data back
                    if let Err(e) = tcp_stream.write_all(&buf[0..n]).await {
                        eprintln!("failed to write to socket; err = {:?}", e);
                        return;
                    }
                }
            });
        }
    }
}
After reading the Tokio documentation (https://tokio.rs/docs/overview/), here's my mental model of this example. A task is spawned for each new TCP connection. And a task is ended whenever a read/write error occurs, or when the client ends the connection (i.e. n == 0 case). Therefore, if there are 20 connected clients at a point in time, there would be 20 spawned tasks. However, under the hood, this is NOT equivalent to spawning 20 threads to handle the connected clients concurrently. As far as I understand, this is basically the problem that asynchronous runtimes are trying to solve. Correct so far?
Next, my mental model is that a tokio scheduler (e.g. the multi-threaded threaded_scheduler which is the default for apps, or the single-threaded basic_scheduler which is the default for tests) will schedule these tasks concurrently on 1-to-N threads. (Side question: for the threaded_scheduler, is N fixed during the app's lifetime? If so, is it equal to num_cpus::get()?). If one task is .awaiting the read or write_all operations, then the scheduler can use the same thread to perform more work for one of the other 19 tasks. Still correct?
Finally, I'm curious whether the outer code (i.e. the code that is .awaiting for tcp_listener.accept()) is itself a task? Such that in the 20 connected clients example, there aren't really 20 tasks but 21: one to listen for new connections + one per connection. All of these 21 tasks could be scheduled concurrently on one or many threads, depending on the scheduler. In the following example, I wrap the outer code in a tokio::spawn and .await the handle. Is it completely equivalent to the example above?
use tokio::net::TcpListener;
use tokio::prelude::*;

#[tokio::main]
async fn main() {
    let main_task_handle = tokio::spawn(async move {
        if let Ok(mut tcp_listener) = TcpListener::bind("127.0.0.1:8080").await {
            while let Ok((mut tcp_stream, _socket_addr)) = tcp_listener.accept().await {
                tokio::spawn(async move {
                    // ... same as above ...
                });
            }
        }
    });
    main_task_handle.await.unwrap();
}
This answer is a summary of an answer I received on Tokio's Discord from Alice Ryhl. Big thank you!
First of all, indeed, for the multi-threaded scheduler, the number of OS threads is fixed to num_cpus.
Second, Tokio can swap the currently running task at every .await on a per-thread basis.
Third, the main function runs in its own task, which is spawned by the #[tokio::main] macro.
Therefore, for the first code block example, if there are 20 connected clients, there would be 21 tasks: one for the main macro + one for each of the 20 open TCP streams. For the second code block example, there would be 22 tasks because of the extra outer tokio::spawn but it's needless and doesn't add any concurrency.
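A tiny experiment (my own sketch, assuming tokio 0.2 with the "full" feature, like the snippets above) makes the tasks-vs-threads point observable: spawn many more tasks than you have cores and print which OS thread each one resumes on. The thread ids repeat, because the tasks are multiplexed onto the fixed pool of worker threads.

use std::thread;
use std::time::Duration;

#[tokio::main]
async fn main() {
    let mut handles = Vec::new();
    for i in 0..100 {
        handles.push(tokio::spawn(async move {
            // The .await is a point where the scheduler is free to run other tasks
            // (and to resume this one on a different worker thread).
            tokio::time::delay_for(Duration::from_millis(10)).await;
            println!("task {} resumed on {:?}", i, thread::current().id());
        }));
    }
    for h in handles {
        h.await.unwrap();
    }
}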
I have written a bot for the Discord chat service using the discord-rs library. This library gives me events when they arise in a single thread in a main loop:
fn start() {
    // ...
    loop {
        let event = match connection.recv_event() {
            Ok(event) => event,
            Err(err) => { ... },
        };
    }
}
I want to add some timers and other things which are calculated in their own threads and which must notify me to do something in the main loop's thread. I also want to add Twitter support. So it may look like this:
(Discord's network connection, Twitter network connection, some timer in another thread) -> main loop
This will look something like this:
fn start() {
    // ...
    loop {
        let event = match recv_events() {
            // 1. if Discord - do something with discord
            // 2. if timer - handle timer's notification
            // 3. if Twitter network connection - handle twitter
        };
    }
}
With raw C sockets, this could be done by (e)polling them, but here I have no idea how to do that in Rust, or whether it is even possible. I think I want something like a poll over a few different sources that would provide me objects of different types.
I guess this could be implemented if I provide a wrapper for mio's Evented trait and use mio's poll as described in the Deadline example.
Is there any way to combine these events?
This library gives me events when they arise in a single thread in a main loop
The "single thread" thing is only true for small bots. As soon as you reach the 2500 guilds limit, Discord will refuse to connect your bot in a normal way. You'll have to use sharding. And I guess you're not going to provision new virtual servers for your bot shards. Chance is, you will spawn new threads instead, one event loop per shard.
Here is how I do it, BTW:
fn event_loop(shard_id: u8, total_shards: u8) {
    loop {
        let bot = Discord::from_bot_token("...").expect("!from_bot_token");
        let (mut dc, ready_ev) = bot.connect_sharded(shard_id, total_shards).expect("!connect");
        // ...
    }
}

fn main() {
    let total_shards = 10;
    for shard_id in 0..total_shards {
        sleep(Duration::from_secs(6)); // There must be a five-second pause between connections from one IP.
        ThreadBuilder::new().name(fomat!("shard " (shard_id)))
            .spawn(move || {
                loop {
                    if let Err(err) = catch_unwind(move || event_loop(shard_id, total_shards)) {
                        log!("shard " (shard_id) " panic: " (gstuff::any_to_str(&*err).unwrap_or("")));
                        sleep(Duration::from_secs(10));
                        continue; // Panic restarts the shard.
                    }
                    break;
                }
            })
            .expect("!spawn");
    }
}
I want to add some timers and other things which are calculated in their own threads and which must notify me to do something in the main loop's thread
Option 1. Don't.
Chances are, you don't really need to come back to the Discord event loop! Let's say you want to post a reply, to update an embed, etc. You do not need the Discord event loop to do that!
The Discord API comes in two parts:
1) Websocket API, represented by the Connection, is used to get events from Discord.
2) REST API, represented by the Discord interface, is used to send events out.
You can send events from pretty much anywhere. From any thread. Maybe even from your timers.
Discord is Sync. Wrap it in an Arc and share it with your timers and threads.
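A rough sketch of that (the from_bot_token call mirrors the snippet above; which REST method the timer actually calls is left out, since that depends on what you want it to do):

use std::sync::Arc;
use std::thread;
use std::time::Duration;

use discord::Discord;

fn main() {
    let discord = Arc::new(Discord::from_bot_token("...").expect("!from_bot_token"));

    // A timer thread that can hit the REST API on its own schedule,
    // completely independent of the event loop.
    let discord_for_timer = Arc::clone(&discord);
    thread::spawn(move || loop {
        thread::sleep(Duration::from_secs(60));
        // Call whichever discord-rs REST method you need here.
        let _ = &discord_for_timer;
    });

    // ... the main thread keeps running connection.recv_event() as before ...
}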
Option 2. Seize the opportunity.
Even though recv_event doesn't have a timeout, Discord will be constantly sending you new events. Users are signing in, signing out, typing, posting messages, starting videogames, editing stuff and what not. Indeed, if the stream of events stops then you have a problem with your Discord connection (for my bot I've implemented a High Availability failover based on that signal).
You could share a deque with your threads and timers. Once a timer has finished it will post a little something to the deque, and the event loop will check the deque for new things to do once Discord wakes it with a new event.
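A sketch of that shape (TodoItem and the queue-draining spot are placeholders of mine, not discord-rs API; the sleep at the bottom stands in for blocking on recv_event):

use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

enum TodoItem {
    PostReminder(String),
}

fn main() {
    let todo: Arc<Mutex<VecDeque<TodoItem>>> = Arc::new(Mutex::new(VecDeque::new()));

    // Timer thread: queue up a task once a minute.
    let todo_for_timer = Arc::clone(&todo);
    thread::spawn(move || loop {
        thread::sleep(Duration::from_secs(60));
        todo_for_timer
            .lock()
            .unwrap()
            .push_back(TodoItem::PostReminder("ping".into()));
    });

    // In the real bot this is the connection.recv_event() loop; every incoming
    // Discord event doubles as a chance to drain the deque.
    loop {
        // let event = connection.recv_event(); ...
        while let Some(TodoItem::PostReminder(text)) = todo.lock().unwrap().pop_front() {
            println!("would post: {}", text);
        }
        thread::sleep(Duration::from_millis(100));
    }
}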
Option 3. Bird's-eye view.
As belst has pointed out, you could start a generic event loop, a loop "to rule them all", then lift Discord events into that loop. This is particularly interesting because with sharding you're going to have multiple event loops.
So, Discord event loop -> simple event filter -> channel -> main event loop.
Option 4. Sharded.
If you want your bot to stay online during code upgrades and restarts, then you should provision for a way to restart each shard separately (or otherwise implement a High Availability failover on the shard level, like I did), because you can't immediately reconnect all your shards after a process restart; Discord won't let you.
If all your shards share the same process, then after that process restarts you have to wait five seconds before attaching a new shard. With 10 shards it's almost a minute of bot downtime.
One way to separate the shard restarts is to dedicate a process to every shard. Then when you need to upgrade the bot, you'd restart each process separately. That way you still have to wait five to six seconds per shard, but your users don't.
Even better is that you now need to restart the Discord event loop processes only for discord-rs upgrades and similar maintenance-related tasks. Your main event loop, on the other hand, can be restarted immediately and as often as you like. This should speed up the compile-run-test loop considerably.
So, Discord event loop, in a separate shard process -> simple event filter -> RPC or database -> main event loop, in a separate process.
In your case I would just start a thread for each service you need and then use an mpsc channel to send the events to the main loop.
Example:
use std::thread;
use std::sync::mpsc::channel;

enum Event {
    Discord(()),
    Twitter(()),
    Timer(()),
}

fn main() {
    let (tx, rx) = channel();

    // discord
    let txprime = tx.clone();
    thread::spawn(move || {
        loop {
            // discord loop
            txprime.send(Event::Discord(())).unwrap()
        }
    });

    // twitter
    let txprime = tx.clone();
    thread::spawn(move || {
        loop {
            // twitter loop
            txprime.send(Event::Twitter(())).unwrap()
        }
    });

    // timer
    let txprime = tx.clone();
    thread::spawn(move || {
        loop {
            // timer loop
            txprime.send(Event::Timer(())).unwrap()
        }
    });

    // Main loop
    loop {
        match rx.recv().unwrap() {
            Event::Discord(d) => unimplemented!(),
            Event::Twitter(t) => unimplemented!(),
            Event::Timer(t) => unimplemented!(),
        }
    }
}
I have a large vector of Hyper HTTP request futures and want to resolve them into a vector of results. Since there is a limit of maximum open files, I want to limit concurrency to N futures.
I've experimented with Stream::buffer_unordered, but it seems like it executes futures one by one.
We've used code like this in a project to avoid opening too many TCP sockets. These futures have Hyper futures within, so it seems exactly the same case.
// Convert the iterator into a `Stream`. We will process
// `PARALLELISM` futures at the same time, but with no specified
// order.
let all_done =
    futures::stream::iter(iterator_of_futures.map(Ok))
        .buffer_unordered(PARALLELISM);

// Everything after here is just using the stream in
// some manner, not directly related

let mut successes = Vec::with_capacity(LIMIT);
let mut failures = Vec::with_capacity(LIMIT);

// Pull values off the stream, dividing them into success and
// failure buckets.
let mut all_done = all_done.into_future();
loop {
    match core.run(all_done) {
        Ok((None, _)) => break,
        Ok((Some(v), next_all_done)) => {
            successes.push(v);
            all_done = next_all_done.into_future();
        }
        Err((v, next_all_done)) => {
            failures.push(v);
            all_done = next_all_done.into_future();
        }
    }
}
This is used in a piece of example code, so the event loop (core) is explicitly driven. Watching the number of file handles used by the program showed that it was capped. Additionally, before this bottleneck was added, we quickly ran out of allowable file handles, whereas afterward we did not.
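For reference, with today's futures 0.3 async/await API the same cap looks roughly like this (a sketch of mine; fetch stands in for the per-request Hyper future, and the real program's runtime details are omitted):

use futures::stream::{self, StreamExt};

// Stand-in for one HTTP request future.
async fn fetch(i: u32) -> Result<u32, String> {
    Ok(i)
}

fn main() {
    const PARALLELISM: usize = 10;

    let results: Vec<Result<u32, String>> = futures::executor::block_on(
        stream::iter(0..100)
            .map(fetch)                    // lazily create one future per item
            .buffer_unordered(PARALLELISM) // at most PARALLELISM futures in flight
            .collect::<Vec<_>>(),
    );

    let (successes, failures): (Vec<_>, Vec<_>) =
        results.into_iter().partition(|r| r.is_ok());
    println!("{} ok, {} failed", successes.len(), failures.len());
}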