If we have a value that is already allocated on the stack, will boxing copy it to the heap and then transfer ownership (that's how it works in .NET, with the exception that both copies stay alive)? Or will the compiler be "smart" enough to allocate it directly on the heap from the beginning?
struct Foo {
    x: i32,
}

fn main() {
    // a is allocated on stack?
    let a = Foo { x: 1 };
    // if a is not used, it will be optimized out
    println!("{}", a.x);
    // what happens here? will the stack allocated structure
    // be moved to heap? or was it originally allocated on heap?
    let b = Box::new(a);
}
I'm not a specialist in assembler, but this looks like it is actually allocated on stack and then moved: http://pastebin.com/8PzsgTJ1. But I need a confirmation from someone who actually knows what is happening.
It would be pretty strange for this optimization to happen as you describe it. For example, in this code:
let a = Foo { x: 1 };
// operation that observes a
let b = Box::new(a);
// operation that observes b
&a and &b would be equal, which would be surprising. However, if you do something similar, but don't observe a:
#[inline(never)]
fn frobnotz() -> Box<Foo> {
    let a = Foo { x: 1 };
    Box::new(a)
}
You can see via the LLVM IR that this case was optimized:
define internal fastcc noalias dereferenceable(4) %Foo* @_ZN8frobnotz20h3dca7bc0ee8400bciaaE() unnamed_addr #0 {
entry-block:
  %0 = tail call i8* @je_mallocx(i64 4, i32 0)
  %1 = icmp eq i8* %0, null
  br i1 %1, label %then-block-106-.i.i, label %"_ZN5boxed12Box$LT$T$GT$3new20h2665038481379993400E.exit"

then-block-106-.i.i:                              ; preds = %entry-block
  tail call void @_ZN3oom20he7076b57c17ed7c6HYaE()
  unreachable

"_ZN5boxed12Box$LT$T$GT$3new20h2665038481379993400E.exit": ; preds = %entry-block
  %2 = bitcast i8* %0 to %Foo*
  %x.sroa.0.0..sroa_idx.i = bitcast i8* %0 to i32*
  store i32 1, i32* %x.sroa.0.0..sroa_idx.i, align 4
  ret %Foo* %2
}
Similarly, you can return the struct on the stack and then box it up, and there will still just be the one allocation:
You may think that this gives us terrible performance: return a value and then immediately box it up?! Isn't this pattern the worst of both worlds? Rust is smarter than that. There is no copy in this code. main allocates enough room for the box, passes a pointer to that memory into foo as x, and then foo writes the value straight into the Box.
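A minimal sketch of the "return by value, then box" pattern described above. The names make_foo and boxed_foo are hypothetical and do not come from the quoted documentation. The only thing the language guarantees here is a single heap allocation; whether the intermediate stack copy is actually elided depends on the optimization level and compiler version, so check the generated IR before relying on it.

```rust
struct Foo {
    x: i32,
}

// Hypothetical helper: builds the value and returns it by value.
fn make_foo() -> Foo {
    Foo { x: 1 }
}

// The returned value is boxed immediately. Box::new performs exactly one
// heap allocation; an optimizing build may write the value straight into it.
fn boxed_foo() -> Box<Foo> {
    Box::new(make_foo())
}

fn main() {
    let b = boxed_foo();
    println!("{}", b.x);
}
```

Compiling with rustc -O --emit llvm-ir is one way to see what the optimizer did for a particular toolchain.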
As explained in the official Rust documentation, Box<T>::new(x: T) allocates memory on the heap and then moves the argument into that memory. Accessing a after let b = Box::new(a) is a compile-time error.
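To make the move semantics concrete, here is a small sketch (the function name observe is made up for illustration). It records the address of a before boxing and the address of the boxed value afterwards. Because a is observed before the move, the two addresses differ in practice: one is a stack slot, the other a heap allocation.

```rust
struct Foo {
    x: i32,
}

// Returns (address of `a` before boxing, address of the boxed value).
fn observe() -> (usize, usize) {
    let a = Foo { x: 1 };
    // Observing `a` forces it to have a real stack address.
    let stack_addr = &a as *const Foo as usize;

    let b = Box::new(a); // `a` is moved into the heap allocation here
    // println!("{}", a.x); // compile error (E0382): `a` was moved

    assert_eq!(b.x, 1);
    let heap_addr = &*b as *const Foo as usize;
    (stack_addr, heap_addr)
}

fn main() {
    let (stack_addr, heap_addr) = observe();
    println!("stack: {:#x}, heap: {:#x}", stack_addr, heap_addr);
}
```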
Related
In the below example:
struct Foo {
    a: [u64; 100000],
}

fn foo(mut f: Foo) -> Foo {
    f.a[0] = 99999;
    f.a[1] = 99999;
    println!("{:?}", &mut f as *mut Foo);
    for i in 0..f.a[0] {
        f.a[i as usize] = 21444;
    }
    return f;
}

fn main() {
    let mut f = Foo {
        a: [0; 100000]
    };
    println!("{:?}", &mut f as *mut Foo);
    f = foo(f);
    println!("{:?}", &mut f as *mut Foo);
}
I find that before and after passing into the function foo, the address of f is different. Why does Rust copy such a big struct everywhere but not actually move it (or achieve this optimization)?
I understand how stack memory works. But with the information provided by ownership in Rust, I think the copy can be avoided. The compiler unnecessarily copies the array twice. Can this be an optimization for the Rust compiler?
A move is a memcpy followed by treating the source as non-existent.
Your big array is on the stack. That's just the way Rust's memory model works: local variables are on the stack. Since the stack space of foo is going away when the function returns, there's nothing else the compiler can do except copy the memory to main's stack space.
In some cases, the compiler can rearrange things so that the move can be elided (source and destination are merged into one thing), but this is an optimization that cannot be relied on, especially for big things.
If you don't want to copy the huge array around, allocate it on the heap yourself, either via a Box<[u64]>, or simply by using Vec<u64>.
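A sketch of the heap-allocated variant suggested above, assuming a Box<[u64]> field (the helper round_trip is hypothetical). Moving Foo now copies only the pointer and length, and the heap buffer itself stays where it is, which the sketch checks by comparing the buffer address before and after the call.

```rust
struct Foo {
    // Heap-allocated buffer: moving Foo copies a pointer and a length,
    // not 100,000 u64 values.
    a: Box<[u64]>,
}

fn foo(mut f: Foo) -> Foo {
    f.a[0] = 99999;
    f
}

// Returns the buffer address before and after the move, plus the first element.
fn round_trip() -> (*const u64, *const u64, u64) {
    let f = Foo { a: vec![0u64; 100_000].into_boxed_slice() };
    let before = f.a.as_ptr();
    let f = foo(f);
    let after = f.a.as_ptr();
    (before, after, f.a[0])
}

fn main() {
    let (before, after, first) = round_trip();
    println!("{:?} {:?} {}", before, after, first);
}
```

Building the buffer through vec! also avoids constructing the large array on the stack first.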
Briefly, some data types are stored on the stack, because the compiler knows ahead of time how much memory they will require. Other data types are more flexible and are stored on the heap. The pointer to the data stays on the stack, pointing to the heap data.
My question is: if the Vec data are on the heap, how is it that i32 (and other normally stack-stored types) can be accessed as if they actually were on the stack (copied by indexing)?
In other words, it makes sense to me that I cannot move a String out of the Vec: Strings don't implement Copy and are normally moved, and the same happens when they are elements of a Vec. However, i32 is normally copied, so why does this also happen when the values are part of the vector data on the heap?
Please feel free to point out any conceptual error and point me to existing material if you think I missed something. I have read The Rust Programming Language and checked around a bit.
fn main() {
    // int in stack
    let i: i32 = 1;
    let _ic = i;
    println!("{}", i);

    // String on heap
    let s: String = String::from("ciao cippina");
    let _sc = &s;
    println!("{}", s);

    // array and data on the stack
    let ari = [1, 2, 3];
    println!("{:?}", &ari);
    println!("a 0 {}", ari[0]);

    // array and Pointers on the stack, data on the heap
    let ars = [String::from("ciao"), String::from("mondo")];
    println!("{:?}", &ars);
    println!("a 0 {}", ars[0]);
    // let _ars_1 = ars[0]; // ERROR, cannot move out of array

    // Vec int, its Pointer on stack, all the rest on heap
    let veci = vec![2, 4, 5, 6];
    println!("{:?}", &veci);
    println!("a 0 {}", veci[0]);
    let _veci_1 = veci[0]; // NO ERROR HERE ??

    // Vec string, its Pointer on stack, all the rest on heap
    let vecs = vec![String::from("ciao"), String::from("mondo")];
    println!("{:?}", &vecs);
    println!("a 0 {}", vecs[0]);
    // let _vecs_1 = vecs[0]; // ERROR, cannot move out of Vec
}
Just because an element of a vector lives on the heap doesn't mean that the compiler can't know the size of the element. It doesn't matter where the element lives: if a type is "copyable" (it implements Copy), it can be copied from the stack to the heap and vice versa.
In your case, i32 occupies 4 bytes whether it is on the heap or on the stack (ignoring alignment concerns).
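The difference can be sketched as follows (the function demo is a made-up name). Indexing a Vec<i32> copies the element out, while a String element has to be borrowed, cloned, or explicitly replaced, because a plain move would leave a hole in the vector. std::mem::take is one way to do the replacement: it swaps in an empty String and hands back the original.

```rust
use std::mem;

fn demo() -> (i32, String, String, Vec<String>) {
    let veci = vec![2, 4, 5, 6];
    // i32 is Copy: the 4 bytes are copied out of the heap buffer.
    let n = veci[0];

    let mut vecs = vec![String::from("ciao"), String::from("mondo")];
    // let s = vecs[0]; // compile error: cannot move out of index of Vec<String>

    // Option 1: clone the element (allocates a new String).
    let cloned = vecs[0].clone();
    // Option 2: take the element, leaving an empty String in its place.
    let taken = mem::take(&mut vecs[0]);

    (n, cloned, taken, vecs)
}

fn main() {
    let (n, cloned, taken, rest) = demo();
    println!("{} {} {} {:?}", n, cloned, taken, rest);
}
```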
I'm trying to learn a little about the LLVM IR, particularly what exactly rustc outputs. I'm having a little bit of trouble running even a very simple case.
I put the following in a source file simple.rs:
fn main() {
    let x = 7u32;
    let y = x + 2;
}
and run rustc --emit llvm-ir simple.rs to get the file simple.ll, containing
; ModuleID = 'simple.cgu-0.rs'
source_filename = "simple.cgu-0.rs"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: uwtable
define internal void @_ZN6simple4main17h8ac50d7470339b75E() unnamed_addr #0 {
start:
  br label %bb1

bb1:                                              ; preds = %start
  ret void
}

define i64 @main(i64, i8**) unnamed_addr {
top:
  %2 = call i64 @_ZN3std2rt10lang_start17ha09816a4e25587eaE(void ()* @_ZN6simple4main17h8ac50d7470339b75E, i64 %0, i8** %1)
  ret i64 %2
}

declare i64 @_ZN3std2rt10lang_start17ha09816a4e25587eaE(void ()*, i64, i8**) unnamed_addr

attributes #0 = { uwtable }

!llvm.module.flags = !{!0}

!0 = !{i32 1, !"PIE Level", i32 2}
I then try to run this with the command
lli-3.9 -load ~/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/libstd-35ad9950c7e5074b.so simple.ll
but I get the error message
LLVM ERROR: Invalid type for first argument of main() supplied
I'm able to make a minimal reproduction of this as follows: I make a file called s2.ll, containing
define i32 @main(i64, i8**) {
  ret i32 42
}
and running lli-3.9 s2.ll gives the same error message. But if I change the contents of s2.ll to
define i32 @main(i32, i8**) {
  ret i32 42
}
(i.e. I've changed the type of argc in main) then lli-3.9 s2.ll runs, and echo $? reveals that it did indeed return 42.
I don't think I should have to pass in the i64 explicitly: my argument list of C strings should be put into memory somewhere and the pointer and length passed to main automatically, right? Therefore I assume that I'm doing something wrong in the way I invoke lli, but I have no idea what.
Rust marks its entry point (the function marked with the #[start] attribute, by default the function lang_start in the standard library) as taking an argc parameter of type isize. This is a bug because it should have the type of a C int, so it should be 32 bits on a 64-bit platform, but isize is 64 bits. However, due to the way 64-bit calling conventions work, this happens to still work correctly. The same issue also exists for the return type.
A fix for this was committed on 2017-10-01 and should be present in Rust 1.22.
lli is apparently more strict about checking the type of main which is why it gives the error. But if you use llc instead, it should work correctly.
To get the correct main signature, you can cancel the default main by putting #![no_main] at the top of the module, and provide your own main marked with #[no_mangle]. But note that this will skip the standard library's initialization.
#![no_main]

#[no_mangle]
pub extern "C" fn main(_argc: i32, _argv: *const *const u8) -> i32 {
    0
}
See also:
documentation about lang_items and disabling main
Tracking issue for #[start] feature, where some people mention isize not being correct.
In Rust, we can use the Box<T> type to allocate things on the heap. This type is used to safely abstract pointers to heap memory. Box<T> is provided by the Rust standard library.
I was curious about how Box<T> allocation is implemented, so I found its source code. Here is the code for Box<T>::new (as of Rust 1.0):
impl<T> Box<T> {
    /// Allocates memory on the heap and then moves `x` into it.
    /// [...]
    #[stable(feature = "rust1", since = "1.0.0")]
    #[inline(always)]
    pub fn new(x: T) -> Box<T> {
        box x
    }
}
The only line in the implementation returns the value box x. This box keyword is not explained anywhere in the official documentation; in fact, it is only mentioned briefly on the std::boxed documentation page.
NOTE: This reply is a bit old. Since it talks about internals and unstable features, things have changed a little bit. The basic mechanism remains the same though, so the answer is still capable of explaining the underlying mechanisms of box.
What does box x usually use to allocate and free memory?
The answer is the functions marked with lang items exchange_malloc for allocation and exchange_free for freeing. You can see the implementation of those in the default standard library at heap.rs#L112 and heap.rs#L125.
In the end the box x syntax depends on the following lang items:
owned_box on a Box struct to encapsulate the allocated pointer. This struct does not need a Drop implementation; the compiler implements it automatically.
exchange_malloc to allocate the memory.
exchange_free to free the previously allocated memory.
This can be seen in the lang items chapter of the unstable Rust book, using this no_std example:
#![feature(lang_items, box_syntax, start, no_std, libc)]
#![no_std]

extern crate libc;

extern {
    fn abort() -> !;
}

#[lang = "owned_box"]
pub struct Box<T>(*mut T);

#[lang = "exchange_malloc"]
unsafe fn allocate(size: usize, _align: usize) -> *mut u8 {
    let p = libc::malloc(size as libc::size_t) as *mut u8;

    // malloc failed
    if p as usize == 0 {
        abort();
    }

    p
}

#[lang = "exchange_free"]
unsafe fn deallocate(ptr: *mut u8, _size: usize, _align: usize) {
    libc::free(ptr as *mut libc::c_void)
}

#[start]
fn main(argc: isize, argv: *const *const u8) -> isize {
    let x = box 1;

    0
}

#[lang = "stack_exhausted"] extern fn stack_exhausted() {}
#[lang = "eh_personality"] extern fn eh_personality() {}
#[lang = "panic_fmt"] fn panic_fmt() -> ! { loop {} }
Notice how Drop was not implemented for the Box struct? Well let's see the LLVM IR generated for main:
define internal i64 @_ZN4main20hbd13b522fdb5b7d4ebaE(i64, i8**) unnamed_addr #1 {
entry-block:
  %argc = alloca i64
  %argv = alloca i8**
  %x = alloca i32*
  store i64 %0, i64* %argc, align 8
  store i8** %1, i8*** %argv, align 8
  %2 = call i8* @_ZN8allocate20hf9df30890c435d76naaE(i64 4, i64 4)
  %3 = bitcast i8* %2 to i32*
  store i32 1, i32* %3, align 4
  store i32* %3, i32** %x, align 8
  call void @"_ZN14Box$LT$i32$GT$9drop.103617h8817b938807fc41eE"(i32** %x)
  ret i64 0
}
The allocate (_ZN8allocate20hf9df30890c435d76naaE) was called as expected to build the Box, meanwhile... Look! A Drop method for the Box (_ZN14Box$LT$i32$GT$9drop.103617h8817b938807fc41eE)! Let's see the IR for this method:
define internal void @"_ZN14Box$LT$i32$GT$9drop.103617h8817b938807fc41eE"(i32**) unnamed_addr #0 {
entry-block:
  %1 = load i32** %0
  %2 = ptrtoint i32* %1 to i64
  %3 = icmp ne i64 %2, 2097865012304223517
  br i1 %3, label %cond, label %next

next:                                             ; preds = %cond, %entry-block
  ret void

cond:                                             ; preds = %entry-block
  %4 = bitcast i32* %1 to i8*
  call void @_ZN10deallocate20he2bff5e01707ad50VaaE(i8* %4, i64 4, i64 4)
  br label %next
}
There it is: deallocate (_ZN10deallocate20he2bff5e01707ad50VaaE) being called in the compiler-generated Drop!
Notice that even in the standard library, the Drop trait is not implemented for Box by user code. Indeed, Box is a bit of a magical struct.
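For readers on a current stable toolchain, the allocate / write / drop / deallocate sequence visible in the IR above can be approximated by hand. This is only a sketch: MyBox is a hypothetical type, it uses the public std::alloc functions rather than the exchange_malloc and exchange_free lang items, and it is not how the standard library implements Box. Unlike the real Box, it needs an explicit Drop impl, since the compiler generates that glue only for the owned_box lang item.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr;

/// Hypothetical stand-in for Box<T>, for illustration only.
pub struct MyBox<T> {
    ptr: *mut T,
}

impl<T> MyBox<T> {
    pub fn new(x: T) -> MyBox<T> {
        let layout = Layout::new::<T>();
        // The global allocator must not be asked for zero bytes;
        // this sketch simply rejects zero-sized types.
        assert!(layout.size() != 0, "sketch does not handle zero-sized types");
        unsafe {
            // Step 1: allocate (the role played by exchange_malloc above).
            let p = alloc(layout) as *mut T;
            if p.is_null() {
                handle_alloc_error(layout);
            }
            // Step 2: move the value into the fresh allocation.
            ptr::write(p, x);
            MyBox { ptr: p }
        }
    }

    pub fn get(&self) -> &T {
        unsafe { &*self.ptr }
    }
}

impl<T> Drop for MyBox<T> {
    fn drop(&mut self) {
        unsafe {
            // Step 3: run the destructor of the contained value...
            ptr::drop_in_place(self.ptr);
            // Step 4: ...then free the memory (the role of exchange_free).
            dealloc(self.ptr as *mut u8, Layout::new::<T>());
        }
    }
}

fn main() {
    let b = MyBox::new(1i32);
    println!("{}", b.get());
}
```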
Before box was marked as unstable, it was used as a shorthand for calling Box::new. However, it has always been intended to be able to allocate arbitrary types, such as Rc, or to use arbitrary allocators. Neither of these has been finalized, so it wasn't marked as stable for the 1.0 release. This was done to avoid having to support a bad decision for all of Rust 1.x.
For further reference, you can read the RFC that changed the "placement new" syntax and also feature gated it.
box does exactly what Box::new() does: it creates an owned box.
I believe that you can't find the implementation of the box keyword because it is currently hardcoded to work with owned boxes, and the Box type is a lang item:
#[lang = "owned_box"]
#[stable(feature = "rust1", since = "1.0.0")]
#[fundamental]
pub struct Box<T>(Unique<T>);
Because it is a lang item, the compiler has special logic to handle its instantiation, which it can link with the box keyword.
I believe that the compiler delegates box allocation to functions in the alloc::heap module.
As for what the box keyword does and is supposed to do in general, Shepmaster's answer describes it perfectly.