Why does this bevy project take so long to compile and launch? - rust

I started following this tutorial on how to make a game in Bevy. The code compiles fine, but compilation is still fairly slow: it takes around 8 seconds, and I'm honestly not sure whether that's normal. When I launch the game, the window goes white (Not Responding) for a few seconds (about as long as the compile time, maybe a tiny bit less) before loading properly.
Here's my Cargo.toml:
[package]
name = "rustship"
version = "0.1.0"
edition = "2021"
[dependencies]
bevy = "0.8.1"
# Enable a small amount of optimization in debug mode
[profile.dev]
opt-level = 1
# Enable high optimizations for dependencies (incl. Bevy), but not for our code:
[profile.dev.package."*"]
opt-level = 3
[workspace]
resolver = "2"
I tried it with and without the workspace resolver. My rustup toolchain is nightly-x86_64-pc-windows-gnu and I'm using rust-lld to link the program:
[target.nightly-x86_64-pc-windows-gnu]
linker = "rust-lld.exe"
rustflags = ["-Zshare-generics=n"]
According to the official bevy setup guide it should be faster this way. I tried it with rust-lld and without, but it doesn't seem to change anything.
Here's the output of cargo run (with A_NUMBER being a 4-digit number):
AdapterInfo { name: "NVIDIA GeForce RTX 3090", vendor: A_NUMBER, device: A_NUMBER, device_type: DiscreteGpu, backend: Vulkan }
Any ideas on how I can maybe improve the compile time and make the window load directly? My game isn't heavy at all. For now, I'm just loading a sprite. The guy in the tutorial uses MacOS and it seems to be pretty fast for him.
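One thing worth checking in the config above: Cargo's [target.<key>] tables are keyed by a target triple (or a cfg(...) expression), and nightly-x86_64-pc-windows-gnu is a rustup toolchain name, not a triple. If that is what is happening here, the section would likely never match, which would be consistent with rust-lld making no visible difference. A minimal sketch of the same settings keyed by the triple, assuming the file is .cargo/config.toml in the project root and that the intended target is x86_64-pc-windows-gnu:

```toml
# .cargo/config.toml (assumed location)
# Keyed by the target triple; the toolchain channel ("nightly-") is not part of it.
[target.x86_64-pc-windows-gnu]
linker = "rust-lld.exe"
# -Z flags still require a nightly toolchain.
rustflags = ["-Zshare-generics=n"]
```

Whether rust-lld then links this target without further setup is not something the question establishes, so treat this as something to try rather than a confirmed fix. It only concerns build time and does not address the unresponsive window at launch.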

Related

Explicitly setting the inlining threshold vs. using optimization level "s"

If I'm reading the documentation here right, setting opt-level = "s" in Cargo.toml is equivalent to setting the inlining threshold to 75.
So I would expect the following two Cargo.toml snippets to be equivalent:
[profile.release]
opt-level = "s"
cargo-features = ["profile-rustflags"]
...
[profile.release]
rustflags = ["-C", "inline-threshold=75"]
However, the executable I get with the second version is almost twice the size of the first version, and it matches the size I get without setting the inline-threshold at all (i.e. using the release build's default of 275).
How do I manually set the inlining threshold to match the behaviour of opt-level = "s"? Yes, I could just use opt-level = "s" itself, but my ultimate goal is to then start tweaking the threshold to see how the performance and the binary size changes.

rust / cargo workspace: how to specify different profile for different sub project

I have a Rust Cargo workspace that contains different subprojects:
./
├─Cargo.toml
├─project1/
│ ├─Cargo.toml
│ ├─src/
├─project2/
│ ├─Cargo.toml
│ ├─src/
I would like to build one project optimized for binary size and the other for speed.
From my understanding, profiles can only be tweaked in the root Cargo.toml, so the settings below, for instance, apply to all my sub-projects.
root Cargo.toml:
[workspace]
members = ["project1", "project2"]
[profile.release]
# less code to include into binary
panic = 'abort'
# optimization over all codebase ( better optimization, slower build )
codegen-units = 1
# optimization for size ( more aggressive )
opt-level = 'z'
# optimization for size
# opt-level = 's'
# link time optimization using whole-program analysis
lto = true
If I try to apply this configuration in a sub-project's Cargo.toml, it doesn't work.
Question: is there a way to configure each project independently?
Thank you in advance.
Edit: I forgot to mention that one project is a wasm project built with trunk (I want it to be as small as possible); the other is a backend that I really need to be built for speed.
Each crate in a workspace can have its own .cargo/config.toml where different profiles can be defined. I've toyed around with this a bit to have one crate for an embedded device, one for a CLI utility to connect to the device over serial, and shared libraries for both of them. Pay attention to the caveat in the docs: you need to be in the crate directory for the config to be read; it won't work from the workspace root.
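As a rough sketch of that setup, assuming the layout from the question and that builds are started from inside each crate directory, project1 (taken here to be the wasm crate) could carry its own config file:

```toml
# project1/.cargo/config.toml
# Hypothetical file; it is only picked up when Cargo runs from project1/ or a subdirectory.
[profile.release]
opt-level = "z"   # optimize this crate's builds for size
```

project2 would get an analogous project2/.cargo/config.toml with opt-level = 3. Cargo documents profile values set in a config file as overriding the ones in Cargo.toml, so settings shared by both projects can stay in the root manifest.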

Rust embedded binary size

I'm new to Rust and after many fights with the compiler and borrow-checker I am finally nearly finished with my first project. But now I have the problem that the binary gets too big to fit into the flash of the microcontroller.
I'm using an STM32F103C8 with 64K flash on a BluePill.
At first the code fit on the microcontroller, but bit by bit I had to enable optimizations and the like. Now I compile with:
[profile.dev]
codegen-units = 1
debug = 0
lto = true
opt-level = "z"
and the binary fits. opt-level = "s" generates a binary that is too big. The error I get then is: rust-lld: error: section '.rodata' will not fit in region 'FLASH': overflowed by 606 bytes
As I have under 1000 lines of code and, I would say, not very unusual dependencies, this seems strange.
There are a few sites like this with ways to minimize the binary. Since they are not aimed at embedded targets, most of their suggestions are already followed in an embedded build anyway.
How can I minimize the binary size and still be able to debug it?
My dependencies are:
[dependencies]
cortex-m = "*"
panic-halt = "*"
embedded-hal = "*"
[dependencies.cortex-m-rtfm]
version = "0.4.3"
features = ["timer-queue"]
[dependencies.stm32f1]
version = "*"
features = ["stm32f103", "rt"]
[dependencies.stm32f1xx-hal]
version = "0.4.0"
features = ["stm32f103", "rt"]
Maybe there is a problem there, as I noticed that cargo build compiles some sub-dependencies multiple times in different versions.
Inside the memory.x file:
MEMORY
{
FLASH : ORIGIN = 0x08000000, LENGTH = 64K
RAM : ORIGIN = 0x20000000, LENGTH = 20K
}
Rustc version: rustc 1.37.0 (eae3437df 2019-08-13)
Edit:
The Rust panic behavior is abort.
The code is viewable at: https://github.com/DarkPhoeniz/rc-switcher-rust
I've run into similar issues and may be able to shed some light on what you can do to reduce the size of the binary you're outputting.
You've already discovered one of them: opt-level = "z". The difference between s and z is the inlining threshold: essentially, the size above which the compiler deems a function not worth inlining. z sets this to 25, s to 75. Depending on what you are building, this may or may not give a significant reduction in size (it primarily affects .rodata and .text).
Another thing you can adjust is your code's behavior on panic. If I remember correctly, the stm32 target supports both unwind and abort, with unwind enabled on the dev profile. As I'm sure you can understand, unwinding the stack is a large and costly process in terms of code size. As such, setting panic = "abort" in your Cargo.toml might reduce the binary size a bit further.
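A minimal sketch of a dev profile combining the settings discussed so far, assuming the goal is to keep the image small while still being able to debug. It relies on the assumption that debug info lives in ELF sections that are not written to the device, so enabling it should grow the file on disk but not the FLASH usage:

```toml
[profile.dev]
opt-level = "z"     # smallest code, as in the question
lto = true
codegen-units = 1
panic = "abort"     # no unwinding machinery
debug = true        # keep debug info for the debugger (assumed not to occupy FLASH)
```

Stepping through "z"-optimized code can still be awkward, since inlining and reordering blur the mapping back to source lines.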
Beyond that, it is down to manual tuning, and tools like cargo-binutils may be extremely useful for this. Depending on your use case, there may be leftover Debug implementations which are only sporadically needed, and that is definitely something that you could act on.
A few other general tips for shrinking the binary:
First, the cargo-bloat utility is useful for determining what is taking up space in your binary, then you can make informed decisions about how to modify your code to shrink it down.
Second, I've had significant success by configuring the compiler to optimize all dependencies, but leave the top level crate unoptimized for easier debugging. You can do this by adding the following to your Cargo.toml:
# Optimize all dependencies
[profile.dev.package."*"]
opt-level = "z"
If you want to debug a specific dependency (for example: cortex-m-rt), you can make it unoptimized like so:
# Don't optimize the `cortex-m-rt` crate
[profile.dev.package.cortex-m-rt]
opt-level = 0
# Optimize all the other dependencies
[profile.dev.package."*"]
opt-level = "z"

How much memory is consumed by jemalloc, debug symbols and panic? How do I find this, and where is it located?

I am new to Rust as well as to programming. I just wrote an LED blinking program on a Raspberry Pi 3 in Rust. It worked well.
My debug binary is 4.7MB, which is really huge. A release build reduced it to 2.5MB. I read that Rust executables are very large because of jemalloc, debug symbols and panic handling, which are all enabled by default. Can somebody help me find out how much of the binary each of these takes up, how to measure it, and where it is located? How can I remove or disable jemalloc?
I am working with Rust 1.38.0 stable on a Raspberry Pi 3, using Visual Studio Code as my IDE.
main.rs file
use rust_gpiozero::*;
use std::thread;
use std::time::Duration;
fn main() {
    // create a new LED attached to pin 17
    let led = LED::new(17);
    // blink the LED 5 times
    for _ in 0..5 {
        led.on();
        thread::sleep(Duration::from_secs(10));
        led.off();
        thread::sleep(Duration::from_secs(10));
    }
}
cargo.toml file
[package]
name = "led_blink"
version = "0.1.0"
authors = ["pi"]
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
libc = "0.2"
rust_gpiozero = "0.2.0"
[profile.release]
codegen-units = 1
I want to know how much of the total size is taken up by jemalloc, debug symbols and panic handling, and how to remove or disable all three.
Any help is appreciated, thank you.
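For context: Rust stopped using jemalloc as the default allocator in 1.32, so a 1.38.0 build like this one uses the system allocator unless a jemalloc crate is added explicitly. A sketch of a release profile aimed at the other two contributors (panic machinery and general code size); the values are illustrative and have not been measured on this project:

```toml
[profile.release]
opt-level = "z"     # optimize for size instead of speed
lto = true
codegen-units = 1
panic = "abort"     # drop unwinding support
# strip = true      # assumption: needs a much newer Cargo (1.59+) than the 1.38.0 used here
```

On 1.38.0, symbols can instead be removed after the build with the platform's strip tool.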

Under Chisel 3, it takes 10 min to compile the Verilator generated C++ of Rocket Chip. Are there any ways to speed this up?

We are modifying Rocket Chip code. After each modification, we need to run the assembly programs, to be sure everything still runs correctly.
To do this, the steps are:
1) Run Chisel, to generate Verilog
2) Run the Verilog through Verilator, to generate C++
3) Compile generated C++
4) Run tests
Step 3 is about 10 times longer than it was under Chisel 2. It takes about 10 minutes, which slows development.
Is there any way to speed this up?
I have found a non-trivial amount of build and run time is spent on not-really-synthesizable constructs that are used for verification support.
For example, I disable the TLMonitors through the Config options. You can find an example in the subsystem Configs.
class WithoutTLMonitors extends Config ((site, here, up) => {
case MonitorsEnabled => false
})
