I'm constructing a single Vec<f64> out of smaller Vec<f64>s and can't quite understand which is the right way to do it.
Here are three attempts with commentary on each:
fn main() {
// this vector will be getting changed so is mutable
let mut a:Vec<Vec<f64>> = vec![];
// these won't be getting changed, they get appended to `a`
let b = vec![0.0, 1.0];
let c = vec![2.0, 3.0];
a.push(b);
a.push(c);
// and then the vector gets flattened into the form that I require:
dbg!(a.concat());
// however, if I want a single level vector constructed from other vectors
let mut d:Vec<f64> = vec![];
// b and c have to be mutable, why is that?
// their contents don't change?
// why can't they just be consumed?
let mut e = vec![0.0, 1.0];
let mut f = vec![2.0, 3.0];
d.append(&mut e);
d.append(&mut f);
dbg!(d);
// another method is to extend from slice
// does this have performance problems compared to the other methods due
// to the copying required?
let mut g:Vec<f64> = vec![];
let h = vec![0.0, 1.0];
let i = vec![2.0, 3.0];
g.extend_from_slice(&h);
g.extend_from_slice(&i);
dbg!(g);
}
Output:
[src/main.rs:15] a.concat() = [
0.0,
1.0,
2.0,
3.0,
]
[src/main.rs:28] d = [
0.0,
1.0,
2.0,
3.0,
]
[src/main.rs:46] g = [
0.0,
1.0,
2.0,
3.0,
]
I'm leaning towards extend_from_slice as it communicates to the reader of the code that vectors h & i will not be changing.
But my question is: is there a performance hit due to the copying? Is there a way to just consume the data when appending without making vectors e & f mutable?
The difference is that append moves each item from one Vec to the other, while extend_from_slice clones each item.
In terms of performance, there is only a slight difference in your case: cloning and copying an f64 are the same operation, but append uses std::ptr::copy_nonoverlapping, which is equivalent to memcpy, while extend_from_slice iterates over and clones each item.
Where performance really matters is when the source Vec contains values that own heap memory (such as String), where you want to avoid the re-allocation that comes with clone, or when the source Vec is large.
The reason the source to append needs to be mutable is because it is mutating its inner state by moving the values out and setting its length to 0.
I would say that concat is for a different use case where the number of source structs is unknown, and is likely to have worse performance due to it needing to iterate over all items.
append() empties the source Vecs, so it needs to take them as &mut. It does not move them because you may still want to use their allocated memory.
In terms of performance, all methods use plain memcpy(): append() because it moves the items, and the other methods because they're specialized to use a simple copy for Copy items. However, append() and extend_from_slice() grow the vector twice, once for each addition, while concat() calculates the required capacity ahead of time, so it is likely to be faster.
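On the question of consuming the sources without declaring them mut, here is a minimal sketch, assuming the sources are not needed afterwards. Vec::extend accepts any IntoIterator, so passing a Vec by value moves its elements out of it. The helper names join_by_extend and join_by_concat are made up for illustration.

```rust
/// Consumes both sources; the caller does not have to declare them `mut`.
fn join_by_extend(e: Vec<f64>, f: Vec<f64>) -> Vec<f64> {
    // Reserve once up front so the two `extend` calls do not reallocate.
    let mut d = Vec::with_capacity(e.len() + f.len());
    d.extend(e); // `e` is moved in and its buffer is freed afterwards
    d.extend(f);
    d
}

/// Leaves the sources untouched and sizes the result in one step.
fn join_by_concat(h: &[f64], i: &[f64]) -> Vec<f64> {
    [h, i].concat()
}

fn main() {
    let e = vec![0.0, 1.0];
    let f = vec![2.0, 3.0];
    let h = vec![0.0, 1.0];
    let i = vec![2.0, 3.0];
    println!("{:?}", join_by_extend(e, f));
    println!("{:?}", join_by_concat(&h, &i));
}
```

The first form suits the case where the inputs are throwaway values; the second keeps h and i usable and signals that they do not change.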
I am learning rust by implementing a raytracer. I have a working prototype that is single threaded and I am trying to make it multithreaded.
In my code, I have a sampler which is basically a wrapper around StdRng::seed_from_u64(123) (this will change when I add different types of samplers) that is mutable because of StdRng. I need repeatable behaviour, which is why I am seeding the random number generator.
In my rendering loop I use the sampler in the following way
let mut sampler = create_sampler(&self.sampler_value);
let sample_count = sampler.sample_count();
println!("Rendering ...");
let progress_bar = get_progress_bar(image.size());
// Generate multiple rays for each pixel in the image
for y in 0..image.size_y {
for x in 0..image.size_x {
image[(x, y)] = (0..sample_count)
.into_iter()
.map(|_| {
let pixel = Vec2::new(x as f32, y as f32) + sampler.next2f();
let ray = self.camera.generate_ray(&pixel);
self.integrator.li(self, &mut sampler, &ray)
})
.sum::<Vec3>()
/ (sample_count as f32);
progress_bar.inc(1);
}
}
When I replace into_iter with into_par_iter, the compiler tells me: cannot borrow sampler as mutable, as it is a captured variable in a Fn closure.
What should I do in this situation?
Thanks!
P.s. If it is of any use, this is the repo : https://github.com/jgsimard/rustrt
Even if Rust wasn't stopping you, you cannot just use a seeded PRNG with parallelism and get a reproducible result out.
Think about it this way: a PRNG with a certain seed/state produces a certain sequence of numbers. Reproducibility (determinism) requires not just that the numbers are the same, but that the way they are taken from the sequence is the same. But if you have multiple threads computing different pixels (different uses) which are racing with each other to fetch numbers from the single PRNG, then the pixels will fetch different numbers on different runs.
In order to get the determinism you want, you must deterministically choose which random number is used for which purpose.
One way to do this would be to make up an “image” of random numbers, computed sequentially, and pass that to the parallel loop. Then each ray has its own random number, which it can use as its seed for another PRNG that only that ray uses.
Another way that can be much more efficient and usable (because it doesn't require any sequentiality at all) is to use hash functions instead of PRNGs. Whenever you want a random number, use a hash function (like those which implement the std::hash::Hasher trait in Rust, but not necessarily the particular one std provides since it's not the fastest) to combine a bunch of information, like
the seed value
the pixel x and y location
which bounce or secondary ray of this pixel you're computing
into a single value which you can use as a pseudorandom number. This way, the “random” results are the same for the same circumstances (because you explicitly specified that it should be computed from them) even if some other part of the program execution changes (whether that's a code change or a thread scheduling decision by the OS).
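A minimal sketch of the hashing idea, using only the standard library. The function name hashed_unit_f32 and the extra sample parameter are assumptions made for illustration. DefaultHasher::new() uses fixed keys, so the result is stable between runs of the same build, but its algorithm is not guaranteed to stay the same across Rust releases; a renderer that needs long-term reproducibility would pick a hasher with a documented algorithm.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Maps (seed, pixel, sample index, bounce) to a value in [0, 1).
fn hashed_unit_f32(seed: u64, x: u32, y: u32, sample: u32, bounce: u32) -> f32 {
    // Fixed keys: the same inputs give the same output within one build.
    let mut hasher = DefaultHasher::new();
    (seed, x, y, sample, bounce).hash(&mut hasher);
    let bits = hasher.finish();
    // Keep the top 24 bits so the quotient is exactly representable as f32.
    (bits >> 40) as f32 / (1u32 << 24) as f32
}

fn main() {
    // No shared state is touched, so any thread may compute any pixel.
    for bounce in 0..3 {
        println!("{}", hashed_unit_f32(123, 4, 7, 0, bounce));
    }
}
```

Because the function takes no &mut state, it can be called from inside a parallel closure without any locking.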
Your sampler is not thread-safe, if only because it is a &mut Sampler and mutable references cannot be shared between threads, obviously.
The easy thing would be to wrap it into an Arc<Mutex<Sampler>> and clone it to every closure. Something like (untested):
let sampler = Arc::new(Mutex::new(create_sampler(&self.sampler_value)));
//...
for y in 0..image.size_y {
for x in 0..image.size_x {
image[(x, y)] = (0..sample_count)
.into_par_iter()
.map({
let sampler = Arc::clone(&sampler);
move |_| {
let mut sampler = sampler.lock().unwrap();
// use the sampler
}
})
.sum::<Vec3>() //...
But that may not be a very efficient way, because the mutex will be locked most of the time and you will kill the parallelism. You may try locking and unlocking the mutex during the ray tracing and see if it improves.
The ideal solution would be to make the Sampler thread-safe and inner mutable, so that the next2f and friends do not need the &mut self part (Sampler::next2f(&self)). Again, the easiest way is having an internal mutex.
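A minimal sketch of the internal-mutex variant, using only the standard library. The Sampler here is a hypothetical stand-in for the question's type: a SplitMix64 step replaces StdRng, and next_f32 stands in for next2f. It shows only that &self is enough once the state sits behind a Mutex; which thread receives which number still depends on scheduling.

```rust
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for the question's sampler.
struct Sampler {
    state: Mutex<u64>,
}

impl Sampler {
    fn new(seed: u64) -> Self {
        Sampler { state: Mutex::new(seed) }
    }

    /// Takes `&self`: the lock provides the mutable access internally.
    fn next_f32(&self) -> f32 {
        let mut s = self.state.lock().unwrap();
        // One SplitMix64 step.
        *s = s.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = *s;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^= z >> 31;
        // Top 24 bits mapped into [0, 1).
        (z >> 40) as f32 / (1u32 << 24) as f32
    }
}

/// Draws `per_thread` samples on each of `threads` threads from one shared sampler.
fn draw_in_parallel(sampler: &Sampler, threads: usize, per_thread: usize) -> Vec<f32> {
    thread::scope(|scope| {
        let handles: Vec<_> = (0..threads)
            .map(|_| {
                scope.spawn(move || {
                    (0..per_thread).map(|_| sampler.next_f32()).collect::<Vec<_>>()
                })
            })
            .collect();
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let sampler = Sampler::new(123);
    let samples = draw_in_parallel(&sampler, 4, 3);
    println!("{} samples drawn", samples.len());
}
```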
Or you can try going lock-less! I mean, your current implementation of that function is:
fn next2f(&mut self) -> Vec2 {
self.current_dimension += 2;
Vec2::new(self.rng.gen(), self.rng.gen())
}
You could replace the current_dimension with an AtomicI32 and the rng with a rand::thread_rng (also untested):
fn next2f(&self) -> Vec2 {
self.current_dimension.fetch_add(2, Ordering::SeqCst);
let mut rng = rand::thread_rng();
Vec2::new(rng.gen(), rng.gen())
}
This is the way I did it. I used this resource: rust-random.github.io/book/guide-parallel.html. I used ChaCha8Rng with the set_stream function to get a seedable PRNG in parallel. I had to move the image[(x, y)] assignment outside of the iterator because into_par_iter does not allow a mutable borrow inside the closure. If you see something dumb in my solution, please tell me!
let size_x = image.size_x;
let img: Vec<Vec<Vec3>> = (0..image.size_y)
.into_par_iter()
.map(|y| {
(0..image.size_x)
.into_par_iter()
.map(|x| {
let mut rng = ChaCha8Rng::seed_from_u64(sampler.seed());
rng.set_stream((y * size_x + x) as u64);
let v = (0..sample_count)
.into_iter()
.map(|_| {
let pixel = Vec2::new(x as f32, y as f32) + sampler.next2f(&mut rng);
let ray = self.camera.generate_ray(&pixel);
self.integrator.li(self, &sampler, &mut rng, &ray)
})
.sum::<Vec3>()
/ (sample_count as f32);
progress_bar.inc(1);
v
}).collect()
}).collect();
for (y, row) in img.into_iter().enumerate() {
for (x, p) in row.into_iter().enumerate() {
image[(x, y)] = p;
}
}
Here is a simple example demonstrating what I'm trying to do:
use std::collections::HashSet;
fn main() {
let mut sets: Vec<HashSet<char>> = vec![];
let mut set = HashSet::new();
set.insert('a');
set.insert('b');
set.insert('c');
set.insert('d');
sets.push(set);
let mut set = HashSet::new();
set.insert('a');
set.insert('b');
set.insert('d');
set.insert('e');
sets.push(set);
let mut set = HashSet::new();
set.insert('a');
set.insert('b');
set.insert('f');
set.insert('g');
sets.push(set);
// Simple intersection of two sets
let simple_intersection = sets[0].intersection(&sets[1]);
println!("Intersection of 0 and 1: {:?}", simple_intersection);
let mut iter = sets.iter();
let base = iter.next().unwrap().clone();
let intersection = iter.fold(base, |acc, set| acc.intersection(set).map(|x| x.clone()).collect());
println!("Intersection of all: {:?}", intersection);
}
This solution uses fold to "accumulate" the intersection, using the first element as the initial value.
Intersections are lazy iterators which iterate through references to the involved sets. Since the accumulator has to have the same type as the first element, we have to clone each set's elements. We can't make a set of owned data from references without cloning. I think I understand this.
For example, this doesn't work:
let mut iter = sets.iter();
let mut base = iter.next().unwrap();
let intersection = iter.fold(base, |acc, set| acc.intersection(set).collect());
println!("Intersection of all: {:?}", intersection);
error[E0277]: a value of type `&HashSet<char>` cannot be built from an iterator over elements of type `&char`
--> src/main.rs:41:73
|
41 | let intersection = iter.fold(base, |acc, set| acc.intersection(set).collect());
| ^^^^^^^ value of type `&HashSet<char>` cannot be built from `std::iter::Iterator<Item=&char>`
|
= help: the trait `FromIterator<&char>` is not implemented for `&HashSet<char>`
Even understanding this, I still don't want to clone the data. In theory it shouldn't be necessary, I have the data in the original vector, I should be able to work with references. That would speed up my algorithm a lot. This is a purely academic pursuit, so I am interested in getting it to be as fast as possible.
To do this, I would need to accumulate into a HashSet<&char>, but I can't do that because I can't intersect a HashSet<&char> with a HashSet<char> in the closure. So it seems like I'm stuck. Is there any way to do this?
Alternatively, I could make a set of references for each set in the vector, but that doesn't really seem much better. Would it even work? I might run into the same problem but with double references instead.
Finally, I don't actually need to retain the original data, so I'd be okay moving the elements into the accumulator set. I can't figure out how to make this happen, since I have to go through intersection which gives me references.
Are any of the above proposals possible? Is there some other zero copy solution that I'm not seeing?
Finally, I don't actually need to retain the original data.
This makes it really easy.
First, optionally sort the sets by size. Then:
let (intersection, others) = sets.split_at_mut(1);
let intersection = &mut intersection[0];
for other in others {
intersection.retain(|e| other.contains(e));
}
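A self-contained sketch of the same retain idea, including the optional sort by size. It is a variation rather than the exact snippet above: it takes the vector by value and uses into_iter instead of split_at_mut, and the function name intersect_in_place is made up for illustration.

```rust
use std::collections::HashSet;

/// Intersects all sets, reusing the smallest one as the accumulator.
fn intersect_in_place(mut sets: Vec<HashSet<char>>) -> HashSet<char> {
    // Smallest first, so `retain` walks as few elements as possible.
    sets.sort_by_key(|s| s.len());
    let mut iter = sets.into_iter();
    let mut result = match iter.next() {
        Some(first) => first,
        None => return HashSet::new(),
    };
    for other in iter {
        // Drops every element that `other` does not also contain.
        result.retain(|e| other.contains(e));
    }
    result
}

fn main() {
    let sets: Vec<HashSet<char>> = ["abcd", "abde", "abfg"]
        .iter()
        .map(|s| s.chars().collect())
        .collect();
    println!("{:?}", intersect_in_place(sets));
}
```

No element is cloned: the accumulator is one of the original sets, and the others are only read.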
You can do it in a fully lazy way using filter and all:
sets[0].iter().filter (move |c| sets[1..].iter().all (|s| s.contains (c)))
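A sketch of how the lazy version can be collected into a set of references, which answers the zero-copy part of the question when the original sets must stay intact. The function name intersect_refs is hypothetical.

```rust
use std::collections::HashSet;

/// Returns references into the first set; no `char` is cloned.
fn intersect_refs(sets: &[HashSet<char>]) -> HashSet<&char> {
    match sets.split_first() {
        None => HashSet::new(),
        Some((first, rest)) => first
            .iter()
            // Keep an element only if every other set contains it too.
            .filter(|c| rest.iter().all(|s| s.contains(*c)))
            .collect(),
    }
}

fn main() {
    let sets: Vec<HashSet<char>> = ["abcd", "abde", "abfg"]
        .iter()
        .map(|s| s.chars().collect())
        .collect();
    println!("{:?}", intersect_refs(&sets));
}
```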
Finally, I don't actually need to retain the original data, so I'd be okay moving the elements into the accumulator set.
The retain method will work perfectly for your requirements then:
fn intersection(mut sets: Vec<HashSet<char>>) -> HashSet<char> {
if sets.is_empty() {
return HashSet::new();
}
if sets.len() == 1 {
return sets.pop().unwrap();
}
let mut result = sets.pop().unwrap();
result.retain(|item| {
sets.iter().all(|set| set.contains(item))
});
result
}
I was expecting a Vec::insert_slice(index, slice) method — a solution for strings (String::insert_str()) does exist.
I know about Vec::insert(), but that inserts only one element at a time, not a slice. Alternatively, when the prepended slice is a Vec one can append to it instead, but this does not generalize. The idiomatic solution probably uses Vec::splice(), but using iterators as in the example makes me scratch my head.
Secondly, the whole concept of prepending has seemingly been exorcised from the docs. There isn't a single mention. I would appreciate comments as to why. Note that relatively obscure methods like Vec::swap_remove() do exist.
My typical use case consists of indexed byte strings.
String::insert_str makes use of the fact that a string is essentially a Vec<u8>. It reallocates the underlying buffer, moves all the initial bytes to the end, then adds the new bytes to the beginning.
This is not generally safe and cannot be directly added to Vec because during the copy the Vec is no longer in a valid state — there are "holes" in the data.
This doesn't matter for String because the data is u8 and u8 doesn't implement Drop. There's no such guarantee for an arbitrary T in a Vec, but if you are very careful to track your state and clean up properly, you can do the same thing — this is what splice does!
the whole concept of prepending has seemingly been exorcised
I'd suppose this is because prepending to a Vec is a poor idea from a performance standpoint. If you need to do it, the naïve case is straightforward:
fn prepend<T>(v: Vec<T>, s: &[T]) -> Vec<T>
where
T: Clone,
{
let mut tmp: Vec<_> = s.to_owned();
tmp.extend(v);
tmp
}
This has a bit higher memory usage as we need to have enough space for two copies of v.
The splice method accepts an iterator of new values and a range of values to replace. In this case, we don't want to replace anything, so we give an empty range of the index we want to insert at. We also need to convert the slice into an iterator of the appropriate type:
let s = &[1, 2, 3];
let mut v = vec![4, 5];
v.splice(0..0, s.iter().cloned());
splice's implementation is non-trivial, but it efficiently does the tracking we need. After removing a chunk of values, it then reuses that chunk of memory for the new values. It also moves the tail of the vector around (maybe a few times, depending on the input iterator). The Drop implementation of Splice ensures that things will always be in a valid state.
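The empty-range trick works at any index, not just 0. A small sketch wrapping it in a helper; the name insert_slice is hypothetical and mirrors the method the question was hoping to find.

```rust
/// Inserts a copy of `items` into `v` starting at `index`, replacing nothing.
/// Panics if `index > v.len()`, like `Vec::insert`.
fn insert_slice<T: Clone>(v: &mut Vec<T>, index: usize, items: &[T]) {
    // The empty range `index..index` removes nothing, so this is a pure insert.
    // The returned `Splice` is dropped right away, which performs the insertion.
    v.splice(index..index, items.iter().cloned());
}

fn main() {
    let mut v = vec![4, 5];
    insert_slice(&mut v, 0, &[1, 2, 3]);
    println!("{:?}", v);
}
```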
I'm more surprised that VecDeque doesn't support it, as it's designed to be more efficient about modifying both the head and tail of the data.
Taking into consideration what Shepmaster said, you could implement a function prepending a slice with Copyable elements to a Vec just like String::insert_str() does in the following way:
use std::ptr;
unsafe fn prepend_slice<T: Copy>(vec: &mut Vec<T>, slice: &[T]) {
let len = vec.len();
let amt = slice.len();
vec.reserve(amt);
ptr::copy(vec.as_ptr(),
vec.as_mut_ptr().offset((amt) as isize),
len);
ptr::copy(slice.as_ptr(),
vec.as_mut_ptr(),
amt);
vec.set_len(len + amt);
}
fn main() {
let mut v = vec![4, 5, 6];
unsafe { prepend_slice(&mut v, &[1, 2, 3]) }
assert_eq!(&v, &[1, 2, 3, 4, 5, 6]);
}
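For comparison, a sketch of a safe way to get the same result: append the slice and then rotate it to the front. This swaps in a different technique from the pointer copies above and will usually move more elements, but it needs no unsafe and works for any Clone type. The name prepend_slice_safe is made up for illustration.

```rust
/// Prepends `slice` to `vec` without any `unsafe` code.
fn prepend_slice_safe<T: Clone>(vec: &mut Vec<T>, slice: &[T]) {
    // Append first: [4, 5, 6] becomes [4, 5, 6, 1, 2, 3].
    vec.extend_from_slice(slice);
    // Rotate the appended items to the front: [1, 2, 3, 4, 5, 6].
    vec.rotate_right(slice.len());
}

fn main() {
    let mut v = vec![4, 5, 6];
    prepend_slice_safe(&mut v, &[1, 2, 3]);
    println!("{:?}", v);
}
```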
I'm trying to get my head around Rust. I've got an alpha version of 1.
Here's the problem I'm trying to program: I have a vector of floats. I want to set up some threads asynchronously. Each thread should wait for the number of seconds specified by each element of the vector, and return the value of the element, plus 10. The results need to be in input order.
It's an artificial example, to be sure, but I wanted to see if I could implement something simple before moving onto more complex code. Here is my code so far:
use std::thread;
use std::old_io::timer;
use std::time::duration::Duration;
fn main() {
let mut vin = vec![1.4f64, 1.2f64, 1.5f64];
let mut guards: Vec<thread::scoped> = Vec::with_capacity(3);
let mut answers: Vec<f64> = Vec::with_capacity(3);
for i in 0..3 {
guards[i] = thread::scoped( move || {
let ms = (1000.0f64 * vin[i]) as i64;
let d = Duration::milliseconds(ms);
timer::sleep(d);
println!("Waited {}", vin[i]);
answers[i] = 10.0f64 + (vin[i] as f64);
})};
for i in 0..3 {guards[i].join(); };
for i in 0..3 {println!("{}", vin[i]); }
}
So the input vector is [1.4, 1.2, 1.5], and I'm expecting the output vector to be [11.4, 11.2, 11.5].
There appear to be a number of problems with my code, but the first one is that I get a compilation error:
threads.rs:7:25: 7:39 error: use of undeclared type name `thread::scoped`
threads.rs:7 let mut guards: Vec<thread::scoped> = Vec::with_capacity(3);
^~~~~~~~~~~~~~
error: aborting due to previous error
There also seem to be a number of other problems, including using vin within a closure. Also, I have no idea what move does, other than the fact that every example I've seen seems to use it.
Your error is due to the fact that thread::scoped is a function, not a type. What you want is a Vec<T> where T is the result type of the function. Rust has a neat feature that helps you here: It automatically detects the correct type of your variables in many situations.
If you use
let mut guards = Vec::with_capacity(3);
the type of guards will be chosen when you use .push() the first time.
There also seem to be a number of other problems.
1. You are accessing guards[i] in the first for loop, but the length of the guards vector is 0. Its capacity is 3, which means that you won't have any unnecessary allocations as long as the vector never contains more than 3 elements. Use guards.push(x) instead of guards[i] = x.
2. thread::scoped expects a Fn() -> T, so your closure can return an object. You get that object when you call .join(), so you don't need an answer vector.
3. vin is moved into the closure. Therefore, in the second iteration of the loop that creates your guards, vin isn't available anymore to be moved into the "second" closure. Every loop iteration creates a new closure.
4. i is moved into the closure. I have no idea what's going on there. But the solution is to let inval = vin[i]; outside the closure, and then use inval inside the closure. This also solves point 3.
5. vin is mutable, yet you never mutate it. Don't bind variables mutably if you don't need to.
6. vin is a vector of f64. Therefore (vin[i] as f64) does nothing, and you can simply use vin[i] directly.
7. join moves out of the guard. Since you cannot move out of an array, you cannot index into an array of guards and join the element at the specified index. What you can do is loop over the elements of the array and join each guard.
Basically this means: don't iterate over indices (for i in 0..3), but iterate over elements (for element in vector) whenever possible.
All of the above implemented:
use std::thread;
use std::old_io::timer;
use std::time::duration::Duration;
fn main() {
let vin = vec![1.4f64, 1.2f64, 1.5f64];
let mut guards = Vec::with_capacity(3);
for inval in vin {
guards.push(thread::scoped( move || {
let ms = (1000.0f64 * inval) as i64;
let d = Duration::milliseconds(ms);
timer::sleep(d);
println!("Waited {}", inval);
10.0f64 + inval
}));
}
for guard in guards {
let answer = guard.join();
println!("{}", answer);
};
}
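The code above targets a pre-1.0 alpha; thread::scoped, std::old_io and std::time::duration no longer exist. A sketch of the same structure on current stable Rust, assuming std::thread::scope (available since Rust 1.63). The function name wait_and_add and the scale parameter are made up for illustration; scale only exists so the waits can be shortened.

```rust
use std::thread;
use std::time::Duration;

/// Sleeps `scale * v` seconds per element on its own thread and
/// returns `v + 10.0` for each element, in input order.
fn wait_and_add(vin: &[f64], scale: f64) -> Vec<f64> {
    thread::scope(|s| {
        let guards: Vec<_> = vin
            .iter()
            .map(|&inval| {
                // `inval` is copied into the closure, so `vin` is never moved.
                s.spawn(move || {
                    thread::sleep(Duration::from_secs_f64(inval * scale));
                    10.0 + inval
                })
            })
            .collect();
        // Joining in spawn order keeps the results in input order.
        guards.into_iter().map(|g| g.join().unwrap()).collect()
    })
}

fn main() {
    for answer in wait_and_add(&[1.4, 1.2, 1.5], 0.1) {
        println!("{}", answer);
    }
}
```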
To supplement Ker's answer: if you really need to mutate arrays within a thread, I suppose the closest valid solution for your task would be something like this:
use std::thread::spawn;
use std::old_io::timer;
use std::sync::{Arc, Mutex};
use std::time::duration::Duration;
fn main() {
let vin = Arc::new(vec![1.4f64, 1.2f64, 1.5f64]);
let answers = Arc::new(Mutex::new(vec![0f64, 0f64, 0f64]));
let mut workers = Vec::new();
for i in 0..3 {
let worker_vin = vin.clone();
let worker_answers = answers.clone();
let worker = spawn( move || {
let ms = (1000.0f64 * worker_vin[i]) as i64;
let d = Duration::milliseconds(ms);
timer::sleep(d);
println!("Waited {}", worker_vin[i]);
let mut answers = worker_answers.lock().unwrap();
answers[i] = 10.0f64 + (worker_vin[i] as f64);
});
workers.push(worker);
}
for worker in workers { worker.join().unwrap(); }
for answer in answers.lock().unwrap().iter() {
println!("{}", answer);
}
}
In order to share vectors between several threads, I have to prove that these vectors outlive all of my threads. I cannot use just a Vec, because it will be destroyed at the end of the main block, and another thread could live longer, possibly accessing freed memory. So I used the Arc reference counter, which guarantees that my vectors will be destroyed only when the counter drops to zero.
Arc allows me to share read-only data. In order to mutate the answers array, I have to use a synchronization tool such as Mutex. That is how Rust prevents me from creating data races.
I have a vector data with size unknown at compile time. I want to create a new vector of exactly that size. These variants don't work:
let size = data.len();
let mut try1: Vec<u32> = vec![0 .. size]; //ah, you need compile-time constant
let mut try2: Vec<u32> = Vec::new(size); //ah, there is no constructors with arguments
I'm a bit frustrated: there is no information in the Rust API docs, book, reference, or rustbyexample.com about how to do such a simple, basic task with a vector.
This solution works, but I don't think it is a good one: it is strange to generate the elements one by one when I don't need any particular values for them:
let mut temp: Vec<u32> = range(0u32, data.len() as u32).collect();
The recommended way of doing this is in fact to form an iterator and collect it to a vector. What you want is not precisely clear, however; if you want [0, 1, 2, …, size - 1], you would create a range and collect it to a vector:
let x = (0..size).collect::<Vec<_>>();
(range(0, size) is better written (0..size) now; the range function will be disappearing from the prelude soon.)
If you wish a vector of zeroes, you would instead write it thus:
let x = std::iter::repeat(0).take(size).collect::<Vec<_>>();
If you merely want to preallocate the appropriate amount of space but not push values onto the vector, Vec::with_capacity(capacity) is what you want.
You should also consider whether you need it to be a vector or whether you can work directly with the iterator.
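The three cases side by side, as a sketch on current stable Rust. The vec![value; len] form accepts a runtime length and is a shorter way to get a vector of zeroes than repeat and take. The function names are made up for illustration.

```rust
/// `size` zeroes; the length does not need to be a compile-time constant.
fn zeroed(size: usize) -> Vec<u32> {
    vec![0; size]
}

/// The values 0, 1, ..., size - 1.
fn counting(size: u32) -> Vec<u32> {
    (0..size).collect()
}

/// Empty, but with room for at least `size` elements.
fn preallocated(size: usize) -> Vec<u32> {
    Vec::with_capacity(size)
}

fn main() {
    let data = vec![7u32, 8, 9];
    let size = data.len();
    println!("{:?}", zeroed(size));
    println!("{:?}", counting(size as u32));
    let p = preallocated(size);
    println!("len {} capacity at least {}", p.len(), size);
}
```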
You can use Vec::with_capacity() constructor followed by an unsafe set_len() call:
let n = 128;
let mut v: Vec<u32> = Vec::with_capacity(n);
unsafe { v.set_len(n); }
v[12] = 64; // won't panic
This way the vector will "extend" over the uninitialized memory. If you're going to use it as a buffer it is a valid approach, as long as the type of elements is Copy (primitives are ok, but it will break horribly if the type has a destructor).
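If the buffer is going to be filled anyway, one way to keep the single allocation while never exposing uninitialized elements is to write through Vec::spare_capacity_mut first and call set_len only afterwards. This is a sketch of that variation, not the approach above; the name filled_buffer and the choice to store the index in each slot are assumptions for illustration.

```rust
/// Allocates once, initializes every slot, then sets the length.
fn filled_buffer(n: usize) -> Vec<u32> {
    let mut v: Vec<u32> = Vec::with_capacity(n);
    // The spare capacity may be larger than `n`, so only take `n` slots.
    for (i, slot) in v.spare_capacity_mut().iter_mut().take(n).enumerate() {
        slot.write(i as u32);
    }
    // Sound because the first `n` slots were initialized just above
    // and `n` does not exceed the capacity.
    unsafe { v.set_len(n) };
    v
}

fn main() {
    let mut v = filled_buffer(128);
    v[12] = 64;
    println!("{} {}", v[11], v[12]);
}
```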