We have a while loop that searches for a string in hundred of tables in several hundred databases. We don't need to find the exact location of the string, but just want to know if it exists somewhere.
In each iteration we pick a table at random and search for the string in that table
After every 500 iteration, if the string is not already found, we pop one character from the end of the string and search for the remaining string, so and so forth.
let mut i = 1;
let mut str_not_found = true;
while str_not_found {
if i % 500 == 0{
search_str.pop()
}
//search for string in the table . if it exists set str_not_found to false
}
This works, but we want to parallelize in it. What is the best way to achieve this?
Cargo.toml
[dependencies]
rand = "0.8.5"
crossbeam-channel = "0.5.4"
crossbeam-utils = "0.8.8"
main.rs
use rand::Rng;
use std::thread;
use std::time::Duration;
use std::sync::{Arc, Mutex};
use crossbeam_channel::bounded;
use crossbeam_utils::thread as cthread;
fn main() {
let workers = 15;
let mut rng = rand::thread_rng();
let (sx, rx) = bounded(workers);
let mut search_str = String::from("one");
let found = Arc::new(Mutex::new(false));
println!("FOUND {:?}", *found.lock().unwrap());
cthread::scope(|s| {
let mut idx: u32 = 0;
while !*found.lock().unwrap() && search_str.len() > 0 {
sx.send(search_str.clone()).unwrap();
let r1 = rx.clone();
let _sleep = rng.gen_range(0..30);
s.spawn({
let f = Arc::clone(&found);
move |_| {
thread::sleep(Duration::from_secs(_sleep + 1));
// temporary found the string under 45 thread/worker
if idx == 45 {
let mut tmp = f.lock().unwrap();
*tmp = true;
}
println!("Worker: {:?}, Sleeped: {:?}, Searched : {:?}", idx, _sleep, r1.recv().unwrap());
}
});
if idx > 0 && idx % 15 == 0 {
search_str.pop();
}
idx += 1;
}
}).unwrap();
println!("FOUND {:?}", *found.lock().unwrap());
println!("Bye!");
}
Related
Trying to respect Rust safety rules leads me to write code that is, in this case, less clear than the alternative.
It's marginal, but must be a very common pattern, so I wonder if there's any better way.
The following example doesn't compile:
async fn query_all_items() -> Vec<u32> {
let mut items = vec![];
let limit = 10;
loop {
let response = getResponse().await;
// response is moved here
items.extend(response);
// can't do this, response is moved above
if response.len() < limit {
break;
}
}
items
}
In order to satisfy Rust safety rules, we can pre-compute the break condition:
async fn query_all_items() -> Vec<u32> {
let mut items = vec![];
let limit = 10;
loop {
let response = getResponse().await;
let should_break = response.len() < limit;
// response is moved here
items.extend(response);
// meh
if should_break {
break;
}
}
items
}
Is there any other way?
I agree with Daniel's point that this should be a while rather than a loop, though I'd move the logic to the while rather than creating a boolean:
let mut len = limit;
while len >= limit {
let response = queryItems(limit).await?;
len = response.len();
items.extend(response);
}
Not that you should do this, but an async stream version is possible. However a plain old loop is much easier to read.
use futures::{future, stream, StreamExt}; // 0.3.19
use rand::{
distributions::{Distribution, Uniform},
rngs::ThreadRng,
};
use std::sync::{Arc, Mutex};
use tokio; // 1.15.0
async fn get_response(rng: Arc<Mutex<ThreadRng>>) -> Vec<u32> {
let mut rng = rng.lock().unwrap();
let range = Uniform::from(0..100);
let len_u32 = range.sample(&mut *rng);
let len_usize = usize::try_from(len_u32).unwrap();
vec![len_u32; len_usize]
}
async fn query_all_items() -> Vec<u32> {
let rng = Arc::new(Mutex::new(ThreadRng::default()));
stream::iter(0..)
.then(|_| async { get_response(Arc::clone(&rng)).await })
.take_while(|v| future::ready(v.len() >= 10))
.collect::<Vec<_>>()
.await
.into_iter()
.flatten()
.collect()
}
#[tokio::main]
async fn main() {
// [46, 46, 46, ..., 78, 78, 78], or whatever random list you get
println!("{:?}", query_all_items().await);
}
I would do this in a while loop since the while will surface the flag more easily.
fn query_all_items () -> Vec<Item> {
let items = vec![];
let limit = 10;
let mut limit_reached = false;
while limit_reached {
let response = queryItems(limit).await?;
limit_reached = response.len() >= limit;
items.extend(response);
}
items
}
Without context it's hard to advise ideal code. I would do:
fn my_body_is_ready() -> Vec<u32> {
let mut acc = vec![];
let min = 10;
loop {
let foo = vec![42];
if foo.len() < min {
acc.extend(foo);
break acc;
} else {
acc.extend(foo);
}
}
}
You can also consider this as, is it possible to URLify a string in place in rust?
For example,
Problem statement: Replace whitespace with %20
Assumption: String will have enough capacity left to accommodate new characters.
Input: Hello how are you
Output: Hello%20how%20are%20you
I know there are ways to do this if we don't have to do this "in place". I am solving a problem that explicitly states that you have to update in place.
If there isn't any safe way to do this, is there any particular reason behind that?
[Edit]
I was able to solve this using unsafe approach, but would appreciate a better approach than this. More idiomatic approach if there is.
fn space_20(sentence: &mut String) {
if !sentence.is_ascii() {
panic!("Invalid string");
}
let chars: Vec<usize> = sentence.char_indices().filter(|(_, ch)| ch.is_whitespace()).map(|(idx, _)| idx ).collect();
let char_count = chars.len();
if char_count == 0 {
return;
}
let sentence_len = sentence.len();
sentence.push_str(&"*".repeat(char_count*2)); // filling string with * so that bytes array becomes of required size.
unsafe {
let bytes = sentence.as_bytes_mut();
let mut final_idx = sentence_len + (char_count * 2) - 1;
let mut i = sentence_len - 1;
let mut char_ptr = char_count - 1;
loop {
if i != chars[char_ptr] {
bytes[final_idx] = bytes[i];
if final_idx == 0 {
// all elements are filled.
println!("all elements are filled.");
break;
}
final_idx -= 1;
} else {
bytes[final_idx] = '0' as u8;
bytes[final_idx - 1] = '2' as u8;
bytes[final_idx - 2] = '%' as u8;
// final_idx is of type usize cannot be less than 0.
if final_idx < 3 {
println!("all elements are filled at start.");
break;
}
final_idx -= 3;
// char_ptr is of type usize cannot be less than 0.
if char_ptr > 0 {
char_ptr -= 1;
}
}
if i == 0 {
// all elements are parsed.
println!("all elements are parsed.");
break;
}
i -= 1;
}
}
}
fn main() {
let mut sentence = String::with_capacity(1000);
sentence.push_str(" hello, how are you?");
// sentence.push_str("hello, how are you?");
// sentence.push_str(" hello, how are you? ");
// sentence.push_str(" ");
// sentence.push_str("abcd");
space_20(&mut sentence);
println!("{}", sentence);
}
An O(n) solution that neither uses unsafe nor allocates (provided that the string has enough capacity), using std::mem::take:
fn urlify_spaces(text: &mut String) {
const SPACE_REPLACEMENT: &[u8] = b"%20";
// operating on bytes for simplicity
let mut buffer = std::mem::take(text).into_bytes();
let old_len = buffer.len();
let space_count = buffer.iter().filter(|&&byte| byte == b' ').count();
let new_len = buffer.len() + (SPACE_REPLACEMENT.len() - 1) * space_count;
buffer.resize(new_len, b'\0');
let mut write_pos = new_len;
for read_pos in (0..old_len).rev() {
let byte = buffer[read_pos];
if byte == b' ' {
write_pos -= SPACE_REPLACEMENT.len();
buffer[write_pos..write_pos + SPACE_REPLACEMENT.len()]
.copy_from_slice(SPACE_REPLACEMENT);
} else {
write_pos -= 1;
buffer[write_pos] = byte;
}
}
*text = String::from_utf8(buffer).expect("invalid UTF-8 during URL-ification");
}
(playground)
Basically, it calculates the final length of the string, sets up a reading pointer and a writing pointer, and translates the string from right to left. Since "%20" has more characters than " ", the writing pointer never catches up with the reading pointer.
Is it possible to do this without unsafe?
Yes like this:
fn main() {
let mut my_string = String::from("Hello how are you");
let mut insert_positions = Vec::new();
let mut char_counter = 0;
for c in my_string.chars() {
if c == ' ' {
insert_positions.push(char_counter);
char_counter += 2; // Because we will insert two extra chars here later.
}
char_counter += 1;
}
for p in insert_positions.iter() {
my_string.remove(*p);
my_string.insert(*p, '0');
my_string.insert(*p, '2');
my_string.insert(*p, '%');
}
println!("{}", my_string);
}
Here is the Playground.
But should you do it?
As discussed for example here on Reddit this is almost always not the recommended way of doing this, because both remove and insert are O(n) operations as noted in the documentation.
Edit
A slightly better version:
fn main() {
let mut my_string = String::from("Hello how are you");
let mut insert_positions = Vec::new();
let mut char_counter = 0;
for c in my_string.chars() {
if c == ' ' {
insert_positions.push(char_counter);
char_counter += 2; // Because we will insert two extra chars here later.
}
char_counter += 1;
}
for p in insert_positions.iter() {
my_string.remove(*p);
my_string.insert_str(*p, "%20");
}
println!("{}", my_string);
}
and the corresponding Playground.
I get this error while compiling my code: please help me.
use of moved value: `path`
value used here after moverustc(E0382)
main.rs(16, 9): move occurs because `path` has type `std::result::Result`, which does not implement the `Copy` trait
main.rs(18, 32): `path` moved due to this method call
main.rs(19, 29): value used here after move
fn main() -> std::io::Result<()> {
let paths = fs::read_dir("./").unwrap();
let mut text = String::new();
let mut idx = 0;
for path in paths {
// let path_str = path.unwrap().path().display().to_string();
let path_str = if path.unwrap().path().is_dir() {
path.unwrap().path().display().to_string()
}
else {
let mut path = path.unwrap().path().display().to_string();
path.push_str("[file]");
path
};
let path_trimed = path_str.trim_start_matches("./");
idx += 1;
println!("{} -> file/folder: {}", idx + 1, path_trimed);
text.push_str(&path_trimed);
text.push_str("\n");
}
// println!("{}", text);
// writing the string to file
let mut file = fs::File::create("file_list.txt")?;
file.write_all(&text.as_bytes())?;
Ok(())
}
I think the problem is that you're unwrapping path many times, each unwrapping borrows the variable path, so you rust will complain when you try to unwrap a second time.
I suggest you try to unwrap it just once:
use std::fs;
use std::io::Write;
fn main() -> std::io::Result<()> {
let paths = fs::read_dir("./").unwrap();
let mut text = String::new();
let mut idx = 0;
for path in paths {
// let path_str = path.unwrap().path().display().to_string();
let path = path.unwrap().path();
let path_str = if path.is_dir() {
path.display().to_string()
} else {
let mut path = path.display().to_string();
path.push_str("[file]");
path
};
let path_trimed = path_str.trim_start_matches("./");
idx += 1;
println!("{} -> file/folder: {}", idx + 1, path_trimed);
text.push_str(&path_trimed);
text.push_str("\n");
}
// println!("{}", text);
// writing the string to file
let mut file = fs::File::create("file_list.txt")?;
file.write_all(&text.as_bytes())?;
Ok(())
}
I have a task to shuffle words but the first and last letter of every word must be unchanged. When I try to use filter() it doesn't work properly.
const SEPARATORS: &str = " ,;:!?./%*$=+)#_-('\"&1234567890\r\n";
fn main() {
print!("MAIN:{:?}", mix("Evening,morning"));
}
fn mix(s: &str) -> String {
let mut a: Vec<char> = s.chars().collect();
for group in a.split_mut(|num| SEPARATORS.contains(*num)) {
if group.len() > 4 {
let k = group.first().unwrap().clone();
let c = group[group.len() - 1].clone();
group
.chunks_exact_mut(2)
.filter(|x| x != &[k])
.for_each(|x| x.swap(0, 1))
}
}
let s: String = a.iter().collect();
s
}
Is this what you are looking for?
fn mix(s: &str) -> String {
let mut a: Vec<char> = s.chars().collect();
for words in a.split_mut(|num| SEPARATORS.contains(*num)) {
if words.len() > 4 {
let initial_letter = words.first().unwrap().clone();
let last_letter = words[words.len() - 1].clone();
words[0] = last_letter;
words[words.len() - 1] = initial_letter;
}
}
let s: String = a.iter().collect();
s
}
I'm trying to use a Condvar to limit the number of threads that are active at any given time. I'm having a hard time finding good examples on how to use Condvar. So far I have:
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
fn main() {
let thread_count_arc = Arc::new((Mutex::new(0), Condvar::new()));
let mut i = 0;
while i < 100 {
let thread_count = thread_count_arc.clone();
thread::spawn(move || {
let &(ref num, ref cvar) = &*thread_count;
{
let mut start = num.lock().unwrap();
if *start >= 20 {
cvar.wait(start);
}
*start += 1;
}
println!("hello");
cvar.notify_one();
});
i += 1;
}
}
The compiler error given is:
error[E0382]: use of moved value: `start`
--> src/main.rs:16:18
|
14 | cvar.wait(start);
| ----- value moved here
15 | }
16 | *start += 1;
| ^^^^^ value used here after move
|
= note: move occurs because `start` has type `std::sync::MutexGuard<'_, i32>`, which does not implement the `Copy` trait
I'm entirely unsure if my use of Condvar is correct. I tried staying as close as I could to the example on the Rust API. Wwat is the proper way to implement this?
Here's a version that compiles:
use std::{
sync::{Arc, Condvar, Mutex},
thread,
};
fn main() {
let thread_count_arc = Arc::new((Mutex::new(0u8), Condvar::new()));
let mut i = 0;
while i < 100 {
let thread_count = thread_count_arc.clone();
thread::spawn(move || {
let (num, cvar) = &*thread_count;
let mut start = cvar
.wait_while(num.lock().unwrap(), |start| *start >= 20)
.unwrap();
// Before Rust 1.42, use this:
//
// let mut start = num.lock().unwrap();
// while *start >= 20 {
// start = cvar.wait(start).unwrap()
// }
*start += 1;
println!("hello");
cvar.notify_one();
});
i += 1;
}
}
The important part can be seen from the signature of Condvar::wait_while or Condvar::wait:
pub fn wait_while<'a, T, F>(
&self,
guard: MutexGuard<'a, T>,
condition: F
) -> LockResult<MutexGuard<'a, T>>
where
F: FnMut(&mut T) -> bool,
pub fn wait<'a, T>(
&self,
guard: MutexGuard<'a, T>
) -> LockResult<MutexGuard<'a, T>>
This says that wait_while / wait consumes the guard, which is why you get the error you did - you no longer own start, so you can't call any methods on it!
These functions are doing a great job of reflecting how Condvars work - you give up the lock on the Mutex (represented by start) for a while, and when the function returns you get the lock again.
The fix is to give up the lock and then grab the lock guard return value from wait_while / wait. I've also switched from an if to a while, as encouraged by huon.
For reference, the usual way to have a limited number of threads in a given scope is with a Semaphore.
Unfortunately, Semaphore was never stabilized, was deprecated in Rust 1.8 and was removed in Rust 1.9. There are crates available that add semaphores on top of other concurrency primitives.
let sema = Arc::new(Semaphore::new(20));
for i in 0..100 {
let sema = sema.clone();
thread::spawn(move || {
let _guard = sema.acquire();
println!("{}", i);
})
}
This isn't quite doing the same thing: since each thread is not printing the total number of the threads inside the scope when that thread entered it.
I realized the code I provided didn't do exactly what I wanted it to, so I'm putting this edit of Shepmaster's code here for future reference.
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
fn main() {
let thread_count_arc = Arc::new((Mutex::new(0u8), Condvar::new()));
let mut i = 0;
while i < 150 {
let thread_count = thread_count_arc.clone();
thread::spawn(move || {
let x;
let &(ref num, ref cvar) = &*thread_count;
{
let start = num.lock().unwrap();
let mut start = if *start >= 20 {
cvar.wait(start).unwrap()
} else {
start
};
*start += 1;
x = *start;
}
println!("{}", x);
{
let mut counter = num.lock().unwrap();
*counter -= 1;
}
cvar.notify_one();
});
i += 1;
}
println!("done");
}
Running this in the playground should show more or less expected behavior.
You want to use a while loop, and re-assign start at each iteration, like:
fn main() {
let thread_count_arc = Arc::new((Mutex::new(0), Condvar::new()));
let mut i = 0;
while i < 100 {
let thread_count = thread_count_arc.clone();
thread::spawn(move || {
let &(ref num, ref cvar) = &*thread_count;
let mut start = num.lock().unwrap();
while *start >= 20 {
let current = cvar.wait(start).unwrap();
start = current;
}
*start += 1;
println!("hello");
cvar.notify_one();
});
i += 1;
}
}
See also some article on the topic:
https://medium.com/#polyglot_factotum/rust-concurrency-five-easy-pieces-871f1c62906a
https://medium.com/#polyglot_factotum/rust-concurrency-patterns-condvars-and-locks-e278f18db74f