Possible memory leak with mem::swap in tokio routine [closed] - rust

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
Closed 5 months ago.
The process gets killed by the OOM killer; it seems memory is not being released somewhere. Where does the problem arise: the static, the spawned task, or mem::swap? And how should I solve it?
#[macro_use]
extern crate lazy_static;
use std::mem;
use std::sync::Arc;
use parking_lot::Mutex;
lazy_static! {
    pub static ref Buffer: Arc<Mutex<Vec<i64>>> = Arc::new(Mutex::new(Vec::with_capacity(100)));
}
pub async fn consume() {
    let mut lk = Buffer.lock();
    lk.clear();
    drop(lk);
}
#[tokio::main]
async fn main() {
    let mut vec: Vec<i64> = Vec::with_capacity(100);
    loop {
        vec.push(1);
        if vec.len() == 100 {
            let mut lk = Buffer.lock();
            mem::swap(&mut vec, &mut *lk);
            drop(lk);
            tokio::spawn(async move {
                consume().await
            });
        }
    }
}

I think this is what you expect this code to do (please correct me if I'm wrong):
1. vec is filled with 100 elements and swapped into the buffer.
2. While vec is being filled again, the background task clears the buffer.
3. Rinse and repeat.
But consider what happens when your main loop manages to acquire the lock twice in a row: the task has not yet had time to clear the buffer, so after the mem::swap call vec still holds 100 elements.
The next push brings it to 101, so from this point on vec.len() == 100 will never be true again and your code simply appends to the vector forever.
Note that this only needs to happen once; after that your code never recovers.
The fix is to swap whenever vec has at least 100 elements, rather than only when it has exactly 100:
if vec.len() >= 100 {
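To see why the exact comparison never recovers, here is a minimal single-threaded sketch. It is not the original tokio program: the function name final_len and the rule that the consumer misses exactly one clear are assumptions, chosen to reproduce the race deterministically.

```rust
use std::mem;

/// Simulates the main loop for `pushes` iterations and returns the final
/// length of `vec`. Hypothetical helper, not part of the original program.
fn final_len(exact_check: bool, pushes: usize) -> usize {
    let mut vec: Vec<i64> = Vec::with_capacity(100);
    let mut buffer: Vec<i64> = Vec::with_capacity(100);
    let mut swaps = 0;
    for _ in 0..pushes {
        vec.push(1);
        let full = if exact_check {
            vec.len() == 100
        } else {
            vec.len() >= 100
        };
        if full {
            mem::swap(&mut vec, &mut buffer);
            swaps += 1;
            // Assumption: the consumer is too slow exactly once, right after
            // the first swap, so the buffer is still full at the second swap.
            if swaps != 1 {
                buffer.clear();
            }
        }
    }
    vec.len()
}

fn main() {
    println!("== 100 check: {}", final_len(true, 1000));
    println!(">= 100 check: {}", final_len(false, 1000));
}
```

With 1000 pushes, the == variant ends with 900 elements still sitting in vec (it stopped swapping after the missed clear), while the >= variant ends with 99 because it resumes swapping on the very next push.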

Related

Does idiomatic rust code always avoid 'unsafe'? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I'm doing LeetCode questions to get better at solving problems and expressing those solutions in Rust, and I've come across a case where it feels like the most natural way of expressing my answer includes unsafe code. Here's what I wrote:
const _0: u8 = '0' as u8;
const _1: u8 = '1' as u8;
pub fn add_binary(a: String, b: String) -> String {
    let a = a.as_bytes();
    let b = b.as_bytes();
    let len = a.len().max(b.len());
    let mut result = vec![_0; len];
    let mut carry = 0;
    for i in 1..=len {
        if i <= a.len() && a[a.len() - i] == _1 {
            carry += 1
        }
        if i <= b.len() && b[b.len() - i] == _1 {
            carry += 1
        }
        if carry & 1 == 1 {
            result[len - i] = _1;
        }
        carry >>= 1;
    }
    if carry == 1 {
        result.insert(0, _1);
    }
    unsafe { String::from_utf8_unchecked(result) }
}
The only usage of unsafe is to do an unchecked conversion of a Vec<u8> to a String, and there is no possibility of causing undefined behaviour in this case because the Vec always just contains some sequence of two different valid ASCII characters. I know I could easily do a checked cast and unwrap it, but that feels a bit silly because of how certain I am that the check can never fail. In idiomatic Rust, is this bad practice? Should unsafe be unconditionally avoided unless the needed performance can't be achieved without it, or are there exceptions (possibly like this) where it's okay? At risk of making the question too opinionated, what are the rules that determine when it is and isn't okay to use unsafe?
You should avoid unsafe except in two situations:
1. You are doing something that is impossible in safe code, e.g. FFI calls. This is the main reason unsafe exists at all.
2. You have shown with benchmarks that unsafe gives a significant speed-up and that this code is a bottleneck.
Your argument
I know I could easily do a checked cast and unwrap it, but that feels a bit silly because of how certain I am that the check can never fail.
is valid for the current version of your code, but you would need to keep this unsafe in mind during all further development.
Unsafe greatly increases the cognitive complexity of code: you cannot change anything in your function without keeping the unsafe block in mind.
I doubt that UTF-8 validation adds more overhead than the possible reallocation in result.insert(0, _1); in your code.
Other nitpicks:
You should add a comment to the unsafe block explaining why it is safe. That makes the code easier to read for other people (or for you, a year after last touching it).
You could define your constants as const _0: u8 = b'0';
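As an illustration of the points above, here is one possible safe variant. It is a sketch rather than the asker's code: taking &str arguments, pushing digits in reverse order, and reserving len + 1 bytes up front are choices made here so that a final carry no longer needs insert(0, ..).

```rust
/// Adds two binary numbers given as strings of '0' and '1'.
/// Sketch of a safe alternative; the signature differs from the question's.
pub fn add_binary(a: &str, b: &str) -> String {
    let (a, b) = (a.as_bytes(), b.as_bytes());
    let len = a.len().max(b.len());
    // One extra slot so a final carry never forces a reallocation or shift.
    let mut result: Vec<u8> = Vec::with_capacity(len + 1);
    let mut carry = 0u8;
    for i in 1..=len {
        if i <= a.len() && a[a.len() - i] == b'1' {
            carry += 1;
        }
        if i <= b.len() && b[b.len() - i] == b'1' {
            carry += 1;
        }
        // Digits are produced least significant first.
        result.push(if carry & 1 == 1 { b'1' } else { b'0' });
        carry >>= 1;
    }
    if carry == 1 {
        result.push(b'1');
    }
    result.reverse();
    // Checked conversion: the validation pass is linear in the output length.
    String::from_utf8(result).expect("result contains only ASCII '0' and '1'")
}

fn main() {
    println!("{}", add_binary("1010", "1011"));
}
```

If benchmarks ever showed the validation to matter, the last line is the only place that would need to change, and that is where a comment justifying the unsafe call would go.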

How to use mutexes in threads without Arc? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I have a function that is supposed to search for primes within a given range. (The algorithm is not important; please ignore the fact that it is very inefficient.)
use std::thread;
use std::sync::Mutex;
use std::convert::TryInto;
/// Takes the search range as (start, end) and outputs a vector with the primes found within
/// that range.
pub fn run(range: (u32, u32)) -> Vec<u32> {
    let mut found_primes: Mutex<Vec<u32>> = Mutex::new(Vec::new());
    let num_threads: usize = 8;
    let num_threads_32: u32 = 8;
    let join_handles: Vec<thread::JoinHandle<()>> = Vec::with_capacity(num_threads);
    // ERROR: `found_primes` does not live long enough
    let vec_ref = &found_primes;
    for t in 0..num_threads_32 {
        thread::spawn(move || {
            let mut n = range.0 + t;
            'n_loop: while n < range.1 {
                for divisor in 2..n {
                    if n % divisor == 0 {
                        n += num_threads_32;
                        continue 'n_loop;
                    }
                }
                // This is the part where I try to add a number to the vector
                vec_ref.lock().expect("Mutex was poisoned!").push(n);
                n += num_threads_32;
            }
            println!("Thread {} is done.", t);
        });
    }
    for handle in join_handles {
        handle.join();
    }
    // ERROR: cannot move out of dereference of `std::sync::MutexGuard<'_, std::vec::Vec<u32>>`
    *found_primes.lock().expect("Mutex was poisoned!")
}
I managed to get it working with std::sync::mpsc, but I'm pretty sure it can be done just with mutexes. However, the borrow checker doesn't like it.
The errors are in comments. I (think I) understand the first error: the compiler can't prove that &found_primes won't be used within a thread after found_primes is dropped (when the function returns), even though I .join() all the threads before that. I'm guessing I'll need unsafe code to make it work. I don't understand the second error, though.
Can someone explain the errors and tell me how to do this with only Mutexes?
The last error is complaining about trying to move the contents out of the mutex. The .lock() returns a MutexGuard that only yields a reference to the contents. You can't move from it or it would leave the mutex in an invalid state. You can get an owned value by cloning, but that shouldn't be necessary if the mutex is going away anyway.
You can use .into_inner() to consume the mutex and return what was in it.
found_primes.into_inner().expect("Mutex was poisoned!")
See this fix and the other linked fix on the playground.
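For the first error, one option that needs neither Arc nor unsafe is std::thread::scope, which was stabilized in Rust 1.63 and is therefore newer than this question. The following is a sketch under that assumption, not the linked playground fix. It also skips 0 and 1 and sorts the output, which the original code does not do.

```rust
use std::sync::Mutex;
use std::thread;

/// Takes the search range as (start, end) and returns the primes in it.
pub fn run(range: (u32, u32)) -> Vec<u32> {
    const NUM_THREADS: u32 = 8;
    let found_primes: Mutex<Vec<u32>> = Mutex::new(Vec::new());

    // All threads spawned inside the scope are joined before `scope` returns,
    // so they may borrow `found_primes` directly.
    thread::scope(|s| {
        for t in 0..NUM_THREADS {
            let found_primes = &found_primes;
            s.spawn(move || {
                let mut n = range.0 + t;
                while n < range.1 {
                    if n >= 2 && (2..n).all(|divisor| n % divisor != 0) {
                        found_primes.lock().expect("Mutex was poisoned!").push(n);
                    }
                    n += NUM_THREADS;
                }
            });
        }
    });

    // No thread holds a reference any more, so the mutex can be consumed.
    let mut primes = found_primes.into_inner().expect("Mutex was poisoned!");
    primes.sort_unstable();
    primes
}

fn main() {
    println!("{:?}", run((0, 30)));
}
```

Because the scope guarantees the join, the compiler can prove the borrow ends before found_primes is consumed by into_inner().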

Why is this looping in Rust so slow? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I'm new to Rust, as in this is the first code I've written. I'm trying to do some benchmarks for an app we will build against Go, however my Rust POC is ridiculously slow and I'm sure it's because I don't fully understand the language yet. This runs in seconds in Go, but has been running for many minutes in Rust:
use serde_json::{Result, Value};
use std::fs::File;
use std::io::BufReader;
fn rule(data: Value) {
    for _i in 0..1000000000 {
        let ru = "589ea4b8-99d1-8d05-9358-4c172c10685b";
        let s = 0 as usize;
        let tl = data["tl"].as_array().unwrap().len();
        for i in s..tl {
            if data["tl"][i]["t"] == "my_value" && data["tl"][i]["refu"] == ru {
                //println!(" t {} matched with reference/ru {}\n", data["tl"][i]["t"], data["tl"][i]["refu"]);
                let el = data["el"].as_array().unwrap().len();
                for j in s..el {
                    if data["el"][j]["is_inpatient"] == true && data["el"][j]["eu"] == data["tl"][i]["eu"] {
                        //println!(" e {} matched.\n", data["el"][j]["eu"]);
                    }
                }
            }
        }
    }
}
fn start() -> Result<()> {
    let file = File::open("../../data.json").expect("File should open read only");
    let reader = BufReader::new(file);
    let v: Value = serde_json::from_reader(reader).expect("JSON was not well-formatted");
    //println!("Running rule");
    rule(v);
    Ok(())
}
fn main() {
    let _r = start();
}
1) I know this is super ugly. It's just a speed POC so if it wins I plan to figure out the language in more detail later.
2) The big one: What am I doing wrong here that's causing Rust to perform so slowly?
Build with the --release flag. Without it you get an unoptimized build with debug checks, which can be literally 100 times slower. Adding that flag is usually all you need to do to beat Go on execution speed, because Go doesn't have a heavy optimizer like this.
Rust doesn't do anything clever to cache [] access, so every time you repeat data["tl"] it searches data for "tl" again. It would be better to store the result of that lookup in a variable.
Loops of the form for i in 0..len {arr[i]} tend to be the slowest form of loop in Rust. It's faster to use iterators: for item in arr {item}. That's because [i] performs a bounds check on every access, which iterators can avoid. In your case that's probably a tiny issue; it matters more in heavy numeric code.
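The last two points can be sketched without serde_json. In the example below, the Entry struct, the HashMap standing in for the parsed Value, and the names count_matches and sample are all assumptions made for illustration. The pattern is what matters: look each key up once, then iterate instead of indexing.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one JSON object from the "tl" or "el" array.
struct Entry {
    kind: &'static str, // plays the role of the "t" field
    eu: &'static str,
    inpatient: bool,
}

/// Counts (tl, el) pairs where the tl entry has the wanted kind and the
/// el entry is inpatient with the same `eu`.
fn count_matches(data: &HashMap<&str, Vec<Entry>>, wanted: &str) -> usize {
    // Look each key up once, outside the loops, instead of on every access.
    let (Some(tl), Some(el)) = (data.get("tl"), data.get("el")) else {
        return 0;
    };
    // Iterators walk the slices directly, with no per-element index check.
    tl.iter()
        .filter(|t| t.kind == wanted)
        .map(|t| el.iter().filter(|e| e.inpatient && e.eu == t.eu).count())
        .sum()
}

/// Small made-up data set used for the demonstration.
fn sample() -> HashMap<&'static str, Vec<Entry>> {
    let mut data = HashMap::new();
    data.insert("tl", vec![
        Entry { kind: "my_value", eu: "a", inpatient: false },
        Entry { kind: "other", eu: "b", inpatient: false },
    ]);
    data.insert("el", vec![
        Entry { kind: "", eu: "a", inpatient: true },
        Entry { kind: "", eu: "a", inpatient: false },
        Entry { kind: "", eu: "b", inpatient: true },
    ]);
    data
}

fn main() {
    println!("{}", count_matches(&sample(), "my_value"));
}
```

With serde_json the same idea would mean binding data["tl"].as_array() and data["el"].as_array() to variables before the outer loop and iterating over those slices.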

Recursive function calculating factorials leads to stack overflow

I tried a recursive factorial algorithm in Rust. I use this version of the compiler:
rustc 1.12.0 (3191fbae9 2016-09-23)
cargo 0.13.0-nightly (109cb7c 2016-08-19)
Code:
extern crate num_bigint;
extern crate num_traits;
use num_bigint::{BigUint, ToBigUint};
use num_traits::One;
fn factorial(num: u64) -> BigUint {
    let current: BigUint = num.to_biguint().unwrap();
    if num <= 1 {
        return One::one();
    }
    return current * factorial(num - 1);
}
fn main() {
    let num: u64 = 100000;
    println!("Factorial {}! = {}", num, factorial(num))
}
I got this error:
$ cargo run
thread 'main' has overflowed its stack
fatal runtime error: stack overflow
error: Process didn't exit successfully
How to fix that? And why do I see this error when using Rust?
Rust doesn't guarantee tail call elimination, so your recursion is limited by your stack size. (This factorial is not a tail call anyway, since the multiplication happens after the recursive call returns.) It may become a feature of Rust in the future (you can read more about it in the Rust FAQ), but in the meantime you will have to either not recurse so deeply or use loops.
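A loop-based version could look like the sketch below. To stay dependency-free it uses u128 with overflow checking instead of BigUint, which is an assumption of this example; with num_bigint the same fold would multiply BigUint values and never return None.

```rust
/// Iterative factorial. Returns None if the result does not fit in a u128.
fn factorial(num: u64) -> Option<u128> {
    // A fold over the range uses constant stack space, however large `num` is.
    (1..=u128::from(num)).try_fold(1u128, |acc, n| acc.checked_mul(n))
}

fn main() {
    println!("{:?}", factorial(20));
    // Overflows u128 and stops early, but does not overflow the stack.
    println!("{:?}", factorial(100_000));
}
```

The call depth no longer depends on num, so the stack size stops being a limit.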
Why?
This is a stack overflow which occurs whenever there is no stack memory left. For example, stack memory is used by
local variables
function arguments
return values
Recursion uses a lot of stack memory, because for every recursive call, the memory for all local variables, function arguments, ... has to be allocated on the stack.
How to fix that?
The obvious solution is to write your algorithm in a non-recursive manner (you should do this when you want to use the algorithm in production!). But you can also just increase the stack size. While the stack size of the main thread can't be modified, you can create a new thread and set a specific stack size:
use std::thread;
fn main() {
    let num: u64 = 100_000;
    // Size of one stack frame for `factorial()` was measured experimentally
    thread::Builder::new()
        .stack_size(num as usize * 0xFF)
        .spawn(move || {
            println!("Factorial {}! = {}", num, factorial(num));
        })
        .unwrap()
        .join()
        .unwrap();
}
This code works and, when executed via cargo run --release (with optimization!), outputs the solution after only a couple of seconds of calculation.
Measuring stack frame size
In case you want to know how the stack frame size (memory requirement for one call) for factorial() was measured: I printed the address of the function argument num on each factorial() call:
fn factorial(num: u64) -> BigUint {
    println!("{:p}", &num);
    // ...
}
The difference between the addresses of two successive calls is (more or less) the stack frame size. On my machine, the difference was slightly less than 0xFF (255), so I just used that as the size.
In case you're wondering why the stack frame size isn't smaller: the Rust compiler doesn't really optimize for this metric. Usually it's really not important, so optimizers tend to sacrifice this memory requirement for better execution speed. I took a look at the assembly and in this case many BigUint methods were inlined. This means that the local variables of other functions are using stack space as well!
Just as an alternative... (which I do not recommend)
Matt's answer is true to an extent. There is a crate called stacker that can artificially increase the stack size for use in recursive algorithms. It does this by allocating some heap memory to overflow into.
As a word of warning: this takes a very long time to run, but it runs, and it doesn't blow the stack. Compiling with optimizations brings the time down, but it's still pretty slow. You're likely to get better performance from a loop, as Matt suggests. I thought I would throw this out there anyway.
extern crate num_bigint;
extern crate num_traits;
extern crate stacker;
use num_bigint::{BigUint, ToBigUint};
use num_traits::One;
fn factorial(num: u64) -> BigUint {
    // println!("Called with: {}", num);
    let current: BigUint = num.to_biguint().unwrap();
    if num <= 1 {
        // println!("Returning...");
        return One::one();
    }
    stacker::maybe_grow(1024 * 1024, 1024 * 1024, || {
        current * factorial(num - 1)
    })
}
fn main() {
    let num: u64 = 100000;
    println!("Factorial {}! = {}", num, factorial(num));
}
I have commented out the debug printlns.. you can uncomment them if you like.

Replacing a borrowed variable [duplicate]

This question already has answers here:
How can I swap in a new value for a field in a mutable reference to a structure?
(2 answers)
Closed 3 years ago.
I have a bucket of objects that need to accumulate values. It's protected by an RwLock, and as such I also keep around its write lock. I want to keep a single write lock for the duration of the process.
For example:
use std::sync::RwLock;
fn main() {
    let locked = RwLock::new(Vec::<u32>::new());
    // this is the entry point for real-world code
    let mut writer = locked.write().unwrap();
    // copy into 'locked' until it is full (has 4 items)
    for v in 0..100 {
        if writer.len() > 4 {
            // discard 'writer' and 'locked', create anew
            locked = RwLock::new(Vec::<u32>::new());
            writer = locked.write().unwrap();
        }
        writer.push(v);
    }
}
While my example operates on fixed data, and so appears to not need the RwLock at all, the real code would enter at "real code" and not necessarily exit on the boundary of locked becoming "full".
How do I create a new locked and writer object when needed without the borrow-checker disagreeing?
I agree with David Grayson: there's no obvious need to recreate the RwLock. Assuming you need the vector after filling it up, use mem::replace to switch out the Vec:
use std::sync::RwLock;
use std::mem;
fn main() {
    let locked = RwLock::new(Vec::<u32>::new());
    let mut writer = locked.write().unwrap();
    for v in 0..100 {
        if writer.len() > 4 {
            // `old_vec` now owns the filled vector; `writer` points at a fresh, empty one
            let old_vec = mem::replace(&mut *writer, Vec::new());
        }
        writer.push(v);
    }
}
If you don't need the Vec, then just call Vec::clear.
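To make the "you need the vector afterwards" case concrete, here is a small sketch that keeps one write guard for the whole loop and collects every filled Vec. The function name fill_in_batches and the batch_size parameter are made up for this example. mem::take is used as a shorthand for mem::replace(.., Vec::new()).

```rust
use std::mem;
use std::sync::RwLock;

/// Pushes `values` into an RwLock-protected Vec through a single write guard,
/// moving the Vec out each time it reaches `batch_size` elements.
fn fill_in_batches(values: impl IntoIterator<Item = u32>, batch_size: usize) -> Vec<Vec<u32>> {
    let locked = RwLock::new(Vec::<u32>::new());
    let mut batches = Vec::new();
    {
        let mut writer = locked.write().unwrap();
        for v in values {
            if writer.len() >= batch_size {
                // Leaves an empty Vec behind the lock and hands us the full one.
                batches.push(mem::take(&mut *writer));
            }
            writer.push(v);
        }
    } // the write guard is dropped here

    // Whatever is left over is the last, possibly partial, batch.
    let rest = locked.into_inner().unwrap();
    if !rest.is_empty() {
        batches.push(rest);
    }
    batches
}

fn main() {
    println!("{:?}", fill_in_batches(0..10, 4));
}
```

The lock and the guard are created once; only the Vec behind them is swapped, so the borrow checker has nothing to object to.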
