How do I really disable all rustc optimizations? - rust

I'm trying to learn assembly through compiling Rust. I have found a way to compile Rust code to binary machine code and be able to objdump it to view the assembly. However, if I write the following:
#![no_main]
#[link_section = ".text.entry"]
#[no_mangle]
pub extern "C" fn _start() -> ! {
let a: u64 = 4;
let b: u64 = 7;
let c: u64 = a * b;
loop {}
}
The assembly I get is:
0000000000000000 <.data>:
0: 1101 addi sp,sp,-32
2: 4511 li a0,4
4: e42a sd a0,8(sp)
6: 451d li a0,7
8: e82a sd a0,16(sp)
a: 4571 li a0,28
c: ec2a sd a0,24(sp)
e: a009 j 0x10
10: a001 j 0x10
So it looks like Rust is collapsing the multiplication to a constant. I'm using the following compile options:
Cargo.toml:
[profile.dev]
opt-level = 0
mir-opt-level = 0
Is there a way to stop Rust from optimizing this?
The LLVM emitted looks like this:
; Function Attrs: noreturn nounwind
define dso_local void @_start() unnamed_addr #0 section ".text.entry" !dbg !22 {
start:
%c.dbg.spill = alloca i64, align 8
%b.dbg.spill = alloca i64, align 8
%a.dbg.spill = alloca i64, align 8
store i64 4, i64* %a.dbg.spill, align 8, !dbg !36
call void @llvm.dbg.declare(metadata i64* %a.dbg.spill, metadata !28, metadata !DIExpression()), !dbg !37
store i64 7, i64* %b.dbg.spill, align 8, !dbg !38
call void @llvm.dbg.declare(metadata i64* %b.dbg.spill, metadata !31, metadata !DIExpression()), !dbg !39
store i64 28, i64* %c.dbg.spill, align 8, !dbg !40
call void @llvm.dbg.declare(metadata i64* %c.dbg.spill, metadata !33, metadata !DIExpression()), !dbg !41
So it looks like the optimization happens before the LLVM passes run.
$ rustc --version
rustc 1.60.0-nightly (c5c610aad 2022-02-14)
Command to build:
RUSTFLAGS="--emit=llvm-bc" cargo build --target riscv64imac-unknown-none-elf --no-default-features
build.rs
fn main() {
println!("cargo:rerun-if-changed=build.rs");
println!("cargo:rustc-link-arg=-Tlink.ld");
}
link.ld
ENTRY(_start)
SECTIONS {
.text : { *(.text); *(.text.*) }
}

There is one compiler pass before the generation of LLVM-IR, which is the generation of MIR, the Rust intermediate representation. If you emit this for the given code with a command such as this one:
cargo rustc -- --emit mir
You will see in the .mir file generated that the optimization already took place there.
fn _start() -> ! {
let mut _0: !; // return place in scope 0 at src\main.rs:5:31: 5:32
let _1: u64; // in scope 0 at src\main.rs:6:9: 6:10
scope 1 {
debug a => _1; // in scope 1 at src\main.rs:6:9: 6:10
let _2: u64; // in scope 1 at src\main.rs:7:9: 7:10
scope 2 {
debug b => _2; // in scope 2 at src\main.rs:7:9: 7:10
let _3: u64; // in scope 2 at src\main.rs:8:9: 8:10
scope 3 {
debug c => _3; // in scope 3 at src\main.rs:8:9: 8:10
}
}
}
bb0: {
_1 = const 4_u64; // scope 0 at src\main.rs:6:18: 6:19
_2 = const 7_u64; // scope 1 at src\main.rs:7:18: 7:19
_3 = const 28_u64; // scope 2 at src\main.rs:8:18: 8:23
goto -> bb1; // scope 3 at src\main.rs:10:5: 10:12
}
bb1: {
goto -> bb1; // scope 3 at src\main.rs:10:5: 10:12
}
}
This is happening because the mir-opt-level option currently only exists as an unstable compiler option. It is not available as a profile property in Cargo. Set it manually on a direct call to the compiler:
cargo rustc -- -Z mir-opt-level=0 --emit mir
And this optimization will disappear:
fn _start() -> ! {
let mut _0: !; // return place in scope 0 at src\main.rs:5:31: 5:32
let mut _1: !; // in scope 0 at src\main.rs:5:33: 11:2
let _2: u64; // in scope 0 at src\main.rs:6:9: 6:10
let mut _5: u64; // in scope 0 at src\main.rs:8:18: 8:19
let mut _6: u64; // in scope 0 at src\main.rs:8:22: 8:23
let mut _7: (u64, bool); // in scope 0 at src\main.rs:8:18: 8:23
let mut _8: !; // in scope 0 at src\main.rs:10:5: 10:12
let mut _9: (); // in scope 0 at src\main.rs:5:1: 11:2
scope 1 {
debug a => _2; // in scope 1 at src\main.rs:6:9: 6:10
let _3: u64; // in scope 1 at src\main.rs:7:9: 7:10
scope 2 {
debug b => _3; // in scope 2 at src\main.rs:7:9: 7:10
let _4: u64; // in scope 2 at src\main.rs:8:9: 8:10
scope 3 {
debug c => _4; // in scope 3 at src\main.rs:8:9: 8:10
}
}
}
bb0: {
StorageLive(_1); // scope 0 at src\main.rs:5:33: 11:2
StorageLive(_2); // scope 0 at src\main.rs:6:9: 6:10
_2 = const 4_u64; // scope 0 at src\main.rs:6:18: 6:19
StorageLive(_3); // scope 1 at src\main.rs:7:9: 7:10
_3 = const 7_u64; // scope 1 at src\main.rs:7:18: 7:19
StorageLive(_4); // scope 2 at src\main.rs:8:9: 8:10
StorageLive(_5); // scope 2 at src\main.rs:8:18: 8:19
_5 = _2; // scope 2 at src\main.rs:8:18: 8:19
StorageLive(_6); // scope 2 at src\main.rs:8:22: 8:23
_6 = _3; // scope 2 at src\main.rs:8:22: 8:23
_7 = CheckedMul(_5, _6); // scope 2 at src\main.rs:8:18: 8:23
assert(!move (_7.1: bool), "attempt to compute `{} * {}`, which would overflow", move _5, move _6) -> bb1; // scope 2 at src\main.rs:8:18: 8:23
}
bb1: {
_4 = move (_7.0: u64); // scope 2 at src\main.rs:8:18: 8:23
StorageDead(_6); // scope 2 at src\main.rs:8:22: 8:23
StorageDead(_5); // scope 2 at src\main.rs:8:22: 8:23
StorageLive(_8); // scope 3 at src\main.rs:10:5: 10:12
goto -> bb2; // scope 3 at src\main.rs:10:5: 10:12
}
bb2: {
_9 = const (); // scope 3 at src\main.rs:10:10: 10:12
goto -> bb2; // scope 3 at src\main.rs:10:5: 10:12
}
}
And this is probably as far as you can go without touching LLVM directly. Some optimisations in specific parts of the code can also be prevented through constructs such as black_box.
See also:
Rustc dev guide book on MIR optimizations

Related

Why is it allowed to have both immutable and mutable borrows of a vector of numeric types in one expression?

a is a Vec<i32> which can be mutably and immutably referenced in one expression:
fn main() {
let mut a = vec![0, 1];
a[0] += a[1]; // OK
}
I thought this compiled because i32 implements Copy, so I created another type that implements Copy and compiled it like the first example, but it fails:
use std::ops::AddAssign;
#[derive(Clone, Copy, PartialEq, Debug, Default)]
struct MyNum(i32);
impl AddAssign for MyNum {
fn add_assign(&mut self, rhs: MyNum) {
*self = MyNum(self.0 + rhs.0)
}
}
fn main() {
let mut b = vec![MyNum(0), MyNum(1)];
b[0] += b[1];
}
playground
error[E0502]: cannot borrow `b` as immutable because it is also borrowed as mutable
--> src/main.rs:14:13
|
14 | b[0] += b[1];
| --------^---
| | |
| | immutable borrow occurs here
| mutable borrow occurs here
| mutable borrow later used here
Why does my MyNum not behave in the same way as i32 even though it implements Copy?
Why can the vector be mutably and immutably referenced in one expression?
I believe the thing you're seeing here is that primitive types do not actually call their std::ops equivalents. Those std::ops may just be included for seamless trait extensions, etc. I think the blog post Rust Tidbits: What Is a Lang Item? partially explains this.
I exported the MIR of your example that works with primitive types. I got:
bb5: {
StorageDead(_9); // bb5[0]: scope 1 at src/main.rs:6:8: 6:9
_10 = CheckedAdd((*_8), move _5); // bb5[1]: scope 1 at src/main.rs:6:5: 6:17
assert(!move (_10.1: bool), "attempt to add with overflow") -> [success: bb6, unwind: bb4]; // bb5[2]: scope 1 at src/main.rs:6:5: 6:17
}
I had a lot of difficulty exporting the MIR for the code that was erroring. Outputting MIR without borrow checking is new to me and I couldn't figure out how to do it.
This playground has a very similar thing, but compiles :)
It gives me an actual call to add_assign:
bb3: {
_8 = _9; // bb3[0]: scope 1 at src/main.rs:14:5: 14:9
StorageDead(_10); // bb3[1]: scope 1 at src/main.rs:14:8: 14:9
StorageLive(_11); // bb3[2]: scope 1 at src/main.rs:14:14: 14:22
(_11.0: i32) = const 1i32; // bb3[3]: scope 1 at src/main.rs:14:14: 14:22
// ty::Const
// + ty: i32
// + val: Value(Scalar(0x00000001))
// mir::Constant
// + span: src/main.rs:14:20: 14:21
// + literal: Const { ty: i32, val: Value(Scalar(0x00000001)) }
_7 = const <MyNum as std::ops::AddAssign>::add_assign(move _8, move _11) -> [return: bb5, unwind: bb4]; // bb3[4]: scope 1 at src/main.rs:14:5: 14:22
// ty::Const
// + ty: for<'r> fn(&'r mut MyNum, MyNum) {<MyNum as std::ops::AddAssign>::add_assign}
// + val: Value(Scalar(<ZST>))
// mir::Constant
// + span: src/main.rs:14:5: 14:22
// + literal: Const { ty: for<'r> fn(&'r mut MyNum, MyNum) {<MyNum as std::ops::AddAssign>::add_assign}, val: Value(Scalar(<ZST>)) }
}
How does the primitive case pass the borrow checker? Since add_assign is not called, the immutable reference can be dropped before the mutable reference is required. The MIR simply dereferences the needed location earlier on and passes it through by value.
bb3: {
_5 = (*_6); // bb3[0]: scope 1 at src/main.rs:6:13: 6:17
StorageDead(_7); // bb3[1]: scope 1 at src/main.rs:6:16: 6:17
...
}

What's the difference between var and _var in Rust?

Given this:
fn main() {
let variable = [0; 15];
}
The Rust compiler produces this warning:
= note: #[warn(unused_variables)] on by default
= note: to avoid this warning, consider using `_variable` instead
What's the difference between variable and _variable?
The difference is an underscore at the front, which causes the Rust compiler to allow it to be unused. It is kind of a named version of the bare underscore _ which can be used to ignore a value.
However, _name acts differently than _. The plain underscore drops the value immediately while _name acts like any other variable and drops the value at the end of the scope.
An example of how it does not act exactly the same as a plain underscore:
struct Count(i32);
impl Drop for Count {
fn drop(&mut self) {
println!("dropping count {}", self.0);
}
}
fn main() {
{
let _a = Count(3);
let _ = Count(2);
let _c = Count(1);
}
{
let _a = Count(3);
let _b = Count(2);
let _c = Count(1);
}
}
prints the following (playground):
dropping count 2
dropping count 1
dropping count 3
dropping count 1
dropping count 2
dropping count 3
The key difference between _variable and variable is that the first one tells the compiler not to warn us if the variable is unused in our code. Example:
// src/main.rs
fn main() {
let _x = 1;
let y = 2;
}
Compiling main.rs gives:
warning: unused variable: `y`
--> src/main.rs:3:9
|
3 | let y = 2;
| ^ help: if this is intentional, prefix it with an underscore: `_y`
|
= note: `#[warn(unused_variables)]` on by default
The more interesting case is when we are comparing _ with _variable.
Ignoring an Unused Variable by Starting Its Name with _:
The syntax _x still binds the value to the variable, whereas _ doesn’t bind at all.
Consider this example:
// src/main.rs
fn main() {
let s = Some(String::from("Hello!"));
if let Some(_s) = s {
println!("found a string");
}
println!("{:?}", s);
}
When we try to compile main.rs, we get an error:
error[E0382]: borrow of moved value: `s`
--> src/main.rs:8:22
|
4 | if let Some(_s) = s {
| -- value moved here
...
8 | println!("{:?}", s);
| ^ value borrowed here after partial move
|
= note: move occurs because value has type `std::string::String`, which does not implement the `Copy` trait
help: borrow this field in the pattern to avoid moving `s.0`
|
4 | if let Some(ref _s) = s {
| ^^^
Aha! The syntax _x still binds the value to the variable, which means the ownership of the string inside s is moved to _s; thus we can no longer access s, which is why the error appears when we try to print its value.
The correct way of doing the above is:
// src/main.rs
fn main() {
let s = Some(String::from("Hello!"));
if let Some(_) = s {
println!("found a string");
}
println!("{:?}", s);
}
The above code works just fine: s does not get moved into _, so we can still access it later.
Sometimes I use _ with iterators:
fn main() {
let v = vec![1, 2, 3];
let _ = v
.iter()
.map(|x| {
println!("{}", x);
})
.collect::<Vec<_>>();
}
Running it prints:
1
2
3
When doing more complex operations on iterable types, this pattern acts as a handy utility for me.

Why does shadowing not release a borrowed reference?

It seems that shadowing a variable does not release the borrowed reference it holds. The following code does not compile:
fn main() {
let mut a = 40;
let r1 = &mut a;
let r1 = "shadowed";
let r2 = &mut a;
}
With the error message:
error[E0499]: cannot borrow `a` as mutable more than once at a time
--> src/main.rs:5:19
|
3 | let r1 = &mut a;
| - first mutable borrow occurs here
4 | let r1 = "shadowed";
5 | let r2 = &mut a;
| ^ second mutable borrow occurs here
6 | }
| - first borrow ends here
I would expect the code to compile because the first reference r1 is shadowed before borrowing the second reference r2. Obviously, the first borrow lives until the end of the block although it is no longer accessible after line 4. Why is that the case?
TL;DR: Shadowing is about name-lookup, borrowing is about lifetimes.
From a compiler point of view, variables have no name:
fn main() {
let mut __0 = 40;
let __1 = &mut __0;
let __2 = "shadowed";
let __3 = &mut __0;
}
This is not very readable for a human being, so the language allows us to use descriptive names instead.
Shadowing is an allowance on reusing names, which for the lexical scope of the "shadowing" variable will resolve the name to the "shadowing" one (__2 here) instead of the "original" one (__1 here).
However just because the old one can no longer be accessed does not mean it no longer lives: Shadowing != Assignment. This is especially notable with different scopes:
fn main() {
let i = 3;
for i in 0..10 {
}
println!("{}", i);
}
Will always print 3: once the shadowing variable's scope ends, the name resolves to the original again!
It's not like the original r1 ceases to exist after it becomes shadowed; consider the MIR produced for your code without the last line (r2 binding):
fn main() -> () {
let mut _0: (); // return pointer
scope 1 {
let mut _1: i32; // "a" in scope 1 at src/main.rs:2:9: 2:14
scope 2 {
let _2: &mut i32; // "r1" in scope 2 at src/main.rs:3:9: 3:11
scope 3 {
let _3: &str; // "r1" in scope 3 at src/main.rs:4:9: 4:11
}
}
}
bb0: {
StorageLive(_1); // scope 0 at src/main.rs:2:9: 2:14
_1 = const 40i32; // scope 0 at src/main.rs:2:17: 2:19
StorageLive(_2); // scope 1 at src/main.rs:3:9: 3:11
_2 = &mut _1; // scope 1 at src/main.rs:3:14: 3:20
StorageLive(_3); // scope 2 at src/main.rs:4:9: 4:11
_3 = const "shadowed"; // scope 2 at src/main.rs:4:14: 4:24
_0 = (); // scope 3 at src/main.rs:1:11: 5:2
StorageDead(_3); // scope 2 at src/main.rs:5:2: 5:2
StorageDead(_2); // scope 1 at src/main.rs:5:2: 5:2
StorageDead(_1); // scope 0 at src/main.rs:5:2: 5:2
return; // scope 0 at src/main.rs:5:2: 5:2
}
}
Note that when "shadowed" becomes bound (_3), it doesn't change anything related to the original r1 binding (_2); the name r1 no longer applies to the mutable reference, but the original variable still exists.
I wouldn't consider your example a very useful case of shadowing; its usual applications, e.g. bodies of loops, are much more likely to utilize it.

Destruction order involving temporaries in Rust

In C++ (please correct me if wrong), a temporary bound via constant reference is supposed to outlive the expression it is bound to. I assumed the same was true in Rust, but I get two different behaviors in two different cases.
Consider:
struct A;
impl Drop for A { fn drop(&mut self) { println!("Drop A.") } }
struct B(*const A);
impl Drop for B { fn drop(&mut self) { println!("Drop B.") } }
fn main() {
let _ = B(&A as *const A); // B is destroyed after this expression itself.
}
The output is:
Drop B.
Drop A.
This is what you would expect. But now if you do:
fn main() {
let _b = B(&A as *const A); // _b will be dropped when scope exits main()
}
The output is:
Drop A.
Drop B.
This is not what I expected.
Is this intended and if so then what is the rationale for the difference in behavior in the two cases?
I am using Rust 1.12.1.
Temporaries are dropped at the end of the statement, just like in C++. However, IIRC, the order of destruction in Rust is unspecified (we'll see the consequences of this below), though the current implementation seems to simply drop values in reverse order of construction.
There's a big difference between let _ = x; and let _b = x;. _ isn't an identifier in Rust: it's a wildcard pattern. Since this pattern doesn't find any variables, the final value is effectively dropped at the end of the statement.
On the other hand, _b is an identifier, so the value is bound to a variable with that name, which extends its lifetime until the end of the function. However, the A instance is still a temporary, so it will be dropped at the end of the statement (and I believe C++ would do the same). Since the end of the statement comes before the end of the function, the A instance is dropped first, and the B instance is dropped second.
To make this clearer, let's add another statement in main:
fn main() {
let _ = B(&A as *const A);
println!("End of main.");
}
This produces the following output:
Drop B.
Drop A.
End of main.
So far so good. Now let's try with let _b; the output is:
Drop A.
End of main.
Drop B.
As we can see, Drop B is printed after End of main.. This demonstrates that the B instance is alive until the end of the function, explaining the different destruction order.
Now, let's see what happens if we modify B to take a borrowed pointer with a lifetime instead of a raw pointer. Actually, let's go a step further and remove the Drop implementations for a moment:
struct A;
struct B<'a>(&'a A);
fn main() {
let _ = B(&A);
}
This compiles fine. Behind the scenes, Rust assigns the same lifetime to both the A instance and the B instance (i.e. if we took a reference to the B instance, its type would be &'a B<'a> where both 'a are the exact same lifetime). When two values have the same lifetime, then necessarily we need to drop one of them before the other, and as mentioned above, the order is unspecified. What happens if we add back the Drop implementations?
struct A;
impl Drop for A { fn drop(&mut self) { println!("Drop A.") } }
struct B<'a>(&'a A);
impl<'a> Drop for B<'a> { fn drop(&mut self) { println!("Drop B.") } }
fn main() {
let _ = B(&A);
}
Now we're getting a compiler error:
error: borrowed value does not live long enough
--> <anon>:8:16
|
8 | let _ = B(&A);
| ^ does not live long enough
|
note: reference must be valid for the destruction scope surrounding statement at 8:4...
--> <anon>:8:5
|
8 | let _ = B(&A);
| ^^^^^^^^^^^^^^
note: ...but borrowed value is only valid for the statement at 8:4
--> <anon>:8:5
|
8 | let _ = B(&A);
| ^^^^^^^^^^^^^^
help: consider using a `let` binding to increase its lifetime
--> <anon>:8:5
|
8 | let _ = B(&A);
| ^^^^^^^^^^^^^^
Since both the A instance and the B instance have been assigned the same lifetime, Rust cannot reason about the destruction order of these objects. The error comes from the fact that Rust refuses to instantiate B<'a> with the lifetime of the object itself when B<'a> implements Drop (this rule was added as the result of RFC 769 before Rust 1.0). If it was allowed, drop would be able to access values that have already been dropped! However, if B<'a> doesn't implement Drop, then it's allowed, because we know that no code will try to access B's fields when the struct is dropped.
Raw pointers themselves do not carry any sort of lifetime so the compiler might do something like this:
Example:
B is created (so that it can hold an *const A in it)
A is created
B is not bound to a binding and thus gets dropped
A is not needed and thus gets dropped
Let's check out the MIR:
fn main() -> () {
let mut _0: (); // return pointer
let mut _1: B;
let mut _2: *const A;
let mut _3: *const A;
let mut _4: &A;
let mut _5: &A;
let mut _6: A;
let mut _7: ();
bb0: {
StorageLive(_1); // scope 0 at <anon>:8:13: 8:30
StorageLive(_2); // scope 0 at <anon>:8:15: 8:29
StorageLive(_3); // scope 0 at <anon>:8:15: 8:17
StorageLive(_4); // scope 0 at <anon>:8:15: 8:17
StorageLive(_5); // scope 0 at <anon>:8:15: 8:17
StorageLive(_6); // scope 0 at <anon>:8:16: 8:17
_6 = A::A; // scope 0 at <anon>:8:16: 8:17
_5 = &_6; // scope 0 at <anon>:8:15: 8:17
_4 = &(*_5); // scope 0 at <anon>:8:15: 8:17
_3 = _4 as *const A (Misc); // scope 0 at <anon>:8:15: 8:17
_2 = _3; // scope 0 at <anon>:8:15: 8:29
_1 = B::B(_2,); // scope 0 at <anon>:8:13: 8:30
drop(_1) -> bb1; // scope 0 at <anon>:8:31: 8:31
}
bb1: {
StorageDead(_1); // scope 0 at <anon>:8:31: 8:31
StorageDead(_2); // scope 0 at <anon>:8:31: 8:31
StorageDead(_3); // scope 0 at <anon>:8:31: 8:31
StorageDead(_4); // scope 0 at <anon>:8:31: 8:31
StorageDead(_5); // scope 0 at <anon>:8:31: 8:31
drop(_6) -> bb2; // scope 0 at <anon>:8:31: 8:31
}
bb2: {
StorageDead(_6); // scope 0 at <anon>:8:31: 8:31
_0 = (); // scope 0 at <anon>:7:11: 9:2
return; // scope 0 at <anon>:9:2: 9:2
}
}
As we can see, drop(_1) is indeed called before drop(_6), as presumed; thus you get the output above.
Example
In this example B is bound to a binding
B is created (for the same reason as above)
A is created
A is not bound and gets dropped
B goes out of scope and gets dropped
The corresponding MIR:
fn main() -> () {
let mut _0: (); // return pointer
scope 1 {
let _1: B; // "b" in scope 1 at <anon>:8:9: 8:10
}
let mut _2: *const A;
let mut _3: *const A;
let mut _4: &A;
let mut _5: &A;
let mut _6: A;
let mut _7: ();
bb0: {
StorageLive(_1); // scope 0 at <anon>:8:9: 8:10
StorageLive(_2); // scope 0 at <anon>:8:15: 8:29
StorageLive(_3); // scope 0 at <anon>:8:15: 8:17
StorageLive(_4); // scope 0 at <anon>:8:15: 8:17
StorageLive(_5); // scope 0 at <anon>:8:15: 8:17
StorageLive(_6); // scope 0 at <anon>:8:16: 8:17
_6 = A::A; // scope 0 at <anon>:8:16: 8:17
_5 = &_6; // scope 0 at <anon>:8:15: 8:17
_4 = &(*_5); // scope 0 at <anon>:8:15: 8:17
_3 = _4 as *const A (Misc); // scope 0 at <anon>:8:15: 8:17
_2 = _3; // scope 0 at <anon>:8:15: 8:29
_1 = B::B(_2,); // scope 0 at <anon>:8:13: 8:30
StorageDead(_2); // scope 0 at <anon>:8:31: 8:31
StorageDead(_3); // scope 0 at <anon>:8:31: 8:31
StorageDead(_4); // scope 0 at <anon>:8:31: 8:31
StorageDead(_5); // scope 0 at <anon>:8:31: 8:31
drop(_6) -> [return: bb3, unwind: bb2]; // scope 0 at <anon>:8:31: 8:31
}
bb1: {
resume; // scope 0 at <anon>:7:1: 9:2
}
bb2: {
drop(_1) -> bb1; // scope 0 at <anon>:9:2: 9:2
}
bb3: {
StorageDead(_6); // scope 0 at <anon>:8:31: 8:31
_0 = (); // scope 1 at <anon>:7:11: 9:2
drop(_1) -> bb4; // scope 0 at <anon>:9:2: 9:2
}
bb4: {
StorageDead(_1); // scope 0 at <anon>:9:2: 9:2
return; // scope 0 at <anon>:9:2: 9:2
}
}
As we can see, drop(_6) does get called before drop(_1), so we get the behavior you have seen.

How do I run parallel threads of computation on a partitioned array?

I'm trying to distribute an array across threads and have the threads sum up portions of the array in parallel. I want thread 0 to sum elements 0, 1, 2; thread 1 to sum elements 3, 4, 5; thread 2 to sum 6 and 7; and thread 3 to sum 8 and 9.
I'm new to Rust but have coded with C/C++/Java before. I've literally thrown everything and the garbage sink at this program and I was hoping I could receive some guidance.
Sorry my code is sloppy but I will clean it up when it is a finished product. Please ignore all poorly named variables/inconsistent spacing/etc.
use std::io;
use std::rand;
use std::sync::mpsc::{Sender, Receiver};
use std::sync::mpsc;
use std::thread::Thread;
static NTHREADS: usize = 4;
static NPROCS: usize = 10;
fn main() {
let mut a = [0; 10]; // a: [i32; 10]
let mut endpoint = a.len() / NTHREADS;
let mut remElements = a.len() % NTHREADS;
for x in 0..a.len() {
let secret_number = (rand::random::<i32>() % 100) + 1;
a[x] = secret_number;
println!("{}", a[x]);
}
let mut b = a;
let mut x = 0;
check_sum(&mut a);
// serial_sum(&mut b);
// Channels have two endpoints: the `Sender<T>` and the `Receiver<T>`,
// where `T` is the type of the message to be transferred
// (type annotation is superfluous)
let (tx, rx): (Sender<i32>, Receiver<i32>) = mpsc::channel();
let mut scale: usize = 0;
for id in 0..NTHREADS {
// The sender endpoint can be copied
let thread_tx = tx.clone();
// Each thread will send its id via the channel
Thread::spawn(move || {
// The thread takes ownership over `thread_tx`
// Each thread queues a message in the channel
let numTougherThreads: usize = NPROCS % NTHREADS;
let numTasksPerThread: usize = NPROCS / NTHREADS;
let mut lsum = 0;
if id < numTougherThreads {
let mut q = numTasksPerThread+1;
lsum = 0;
while q > 0 {
lsum = lsum + a[scale];
scale+=1;
q = q-1;
}
println!("Less than numToughThreads lsum: {}", lsum);
}
if id >= numTougherThreads {
let mut z = numTasksPerThread;
lsum = 0;
while z > 0 {
lsum = lsum + a[scale];
scale +=1;
z = z-1;
}
println!("Greater than numToughthreads lsum: {}", lsum);
}
// Sending is a non-blocking operation, the thread will continue
// immediately after sending its message
println!("thread {} finished", id);
thread_tx.send(lsum).unwrap();
});
}
// Here, all the messages are collected
let mut globalSum = 0;
let mut ids = Vec::with_capacity(NTHREADS);
for _ in 0..NTHREADS {
// The `recv` method picks a message from the channel
// `recv` will block the current thread if there no messages available
ids.push(rx.recv());
}
println!("Global Sum: {}", globalSum);
// Show the order in which the messages were sent
println!("ids: {:?}", ids);
}
fn check_sum (arr: &mut [i32]) {
let mut sum = 0;
let mut i = 0;
let mut size = arr.len();
loop {
sum += arr[i];
i+=1;
if i == size { break; }
}
println!("CheckSum is {}", sum);
}
So far I've gotten it to do this much. I can't figure out why threads 0 and 1 produce the same sum, and likewise threads 2 and 3:
-5
-49
-32
99
45
-65
-64
-29
-56
65
CheckSum is -91
Greater than numTough lsum: -54
thread 2 finished
Less than numTough lsum: -86
thread 1 finished
Less than numTough lsum: -86
thread 0 finished
Greater than numTough lsum: -54
thread 3 finished
Global Sum: 0
ids: [Ok(-86), Ok(-86), Ok(-54), Ok(-54)]
I managed to rewrite it to work with even numbers by using the below code.
while q > 0 {
if id*s+scale == a.len() { break; }
lsum = lsum + a[id*s+scale];
scale +=1;
q = q-1;
}
println!("Less than numToughThreads lsum: {}", lsum);
}
if id >= numTougherThreads {
let mut z = numTasksPerThread;
lsum = 0;
let mut scale = 0;
while z > 0 {
if id*numTasksPerThread+scale == a.len() { break; }
lsum = lsum + a[id*numTasksPerThread+scale];
scale = scale + 1;
z = z-1;
}
Welcome to Rust! :)
Yeah, at first I didn't realize each thread gets its own copy of scale
Not only that! It also gets its own copy of a!
What you are trying to do could look like the following code. I guess it's easier for you to see a complete working example since you seem to be a Rust beginner and asked for guidance. I deliberately replaced [i32; 10] with a Vec since a Vec is not implicitly Copyable. It requires an explicit clone(); we cannot copy it by accident. Please note all the larger and smaller differences. The code also got a little more functional (less mut). I commented most of the noteworthy things:
extern crate rand;
use std::sync::Arc;
use std::sync::mpsc;
use std::thread;
const NTHREADS: usize = 4; // I replaced `static` by `const`
// gets used for *all* the summing :)
fn sum<I: Iterator<Item=i32>>(iter: I) -> i32 {
let mut s = 0;
for x in iter {
s += x;
}
s
}
fn main() {
// We don't want to clone the whole vector into every closure.
// So we wrap it in an `Arc`. This allows sharing it.
// I also got rid of `mut` here by moving the computations into
// the initialization.
let a: Arc<Vec<_>> =
Arc::new(
(0..10)
.map(|_| {
(rand::random::<i32>() % 100) + 1
})
.collect()
);
let (tx, rx) = mpsc::channel(); // types will be inferred
{ // local scope, we don't need the following variables outside
let num_tasks_per_thread = a.len() / NTHREADS; // same here
let num_tougher_threads = a.len() % NTHREADS; // same here
let mut offset = 0;
for id in 0..NTHREADS {
let chunksize =
if id < num_tougher_threads {
num_tasks_per_thread + 1
} else {
num_tasks_per_thread
};
let my_a = a.clone(); // refers to the *same* `Vec`
let my_tx = tx.clone();
thread::spawn(move || {
let end = offset + chunksize;
let partial_sum =
sum( (&my_a[offset..end]).iter().cloned() );
my_tx.send(partial_sum).unwrap();
});
offset += chunksize;
}
}
// We can close this Sender
drop(tx);
// Iterator magic! Yay! global_sum does not need to be mutable
let global_sum = sum(rx.iter());
println!("global sum via threads : {}", global_sum);
println!("global sum single-threaded: {}", sum(a.iter().cloned()));
}
Using a crate like crossbeam you can write this code:
use crossbeam; // 0.7.3
use rand::distributions::{Distribution, Uniform}; // 0.7.3
const NTHREADS: usize = 4;
fn random_vec(length: usize) -> Vec<i32> {
let step = Uniform::new_inclusive(1, 100);
let mut rng = rand::thread_rng();
step.sample_iter(&mut rng).take(length).collect()
}
fn main() {
let numbers = random_vec(10);
let num_tasks_per_thread = numbers.len() / NTHREADS;
crossbeam::scope(|scope| {
// The `collect` is important to eagerly start the threads!
let threads: Vec<_> = numbers
.chunks(num_tasks_per_thread)
.map(|chunk| scope.spawn(move |_| chunk.iter().cloned().sum::<i32>()))
.collect();
let thread_sum: i32 = threads.into_iter().map(|t| t.join().unwrap()).sum();
let no_thread_sum: i32 = numbers.iter().cloned().sum();
println!("global sum via threads : {}", thread_sum);
println!("global sum single-threaded: {}", no_thread_sum);
})
.unwrap();
}
Scoped threads allow you to pass in a reference that is guaranteed to outlive the thread. You can then use the return value of the thread directly, skipping channels (which are great, just not needed here!).
I followed How can I generate a random number within a range in Rust? to generate the random numbers. I also changed it to be the range [1,100], as I think that's what you meant. However, your original code is actually [-98,100], which you could also do.
Iterator::sum is used to sum up an iterator of numbers.
I threw in some rough performance numbers of the thread work, ignoring the vector construction, working on 100,000,000 numbers, using Rust 1.34 and compiling in release mode:
| threads | time (ns) | relative time (%) |
|---------+-----------+-------------------|
| 1 | 33824667 | 100.00 |
| 2 | 16246549 | 48.03 |
| 3 | 16709280 | 49.40 |
| 4 | 14263326 | 42.17 |
| 5 | 14977901 | 44.28 |
| 6 | 12974001 | 38.36 |
| 7 | 13321743 | 39.38 |
| 8 | 13370793 | 39.53 |
See also:
How can I pass a reference to a stack variable to a thread?
Each of your threads gets its own copy of the scale variable. Threads 1 and 2 produce the same sum because each starts with its own scale of 0 and modifies it in exactly the same way. The same goes for threads 3 and 4.
Rust prevents you from breaking thread safety. If scale were shared by the threads, you would have race conditions when accessing the variable.
Please read about closures, which explain the variable-copying part, and about threading, which explains when and how you can share variables between threads.
