Where is Rust storing all these bytes?

In trying to understand how stack memory works, I wrote the following code to display addresses of where data gets stored:
fn main() {
    let a = "0123456789abcdef0";
    let b = "123456789abcdef01";
    let c = "23456789abcdef012";

    println!("{:p} {}", &a, a.len());
    println!("{:p} {}", &b, b.len());
    println!("{:p} {}", &c, c.len());
}
The output is:
0x7fff288a5448 17
0x7fff288a5438 17
0x7fff288a5428 17
The addresses are only 16 bytes apart, which implies that each 17-byte string is stored in 16 bytes of stack space; that can't be right. My one guess is that some optimization is happening, but I get the same results even when I build with --opt-level 0.
The equivalent C seems to do the right thing:
#include <stdio.h>
#include <string.h>

int main() {
    char a[] = "0123456789abcdef";
    char b[] = "123456789abcdef0";
    char c[] = "23456789abcdef01";

    printf("%p %zu\n", &a, strlen(a) + 1);
    printf("%p %zu\n", &b, strlen(b) + 1);
    printf("%p %zu\n", &c, strlen(c) + 1);
    return 0;
}
Output:
0x7fff5837b440 17
0x7fff5837b420 17
0x7fff5837b400 17

String literals "..." are stored in static memory, and the variables a, b and c are just (fat) pointers to them. They have type &str, which has the following layout:

struct StrSlice {
    data: *const u8,
    length: usize,
}

where the data field points at the sequence of bytes that form the text, and the length field says how many bytes there are.

On a 64-bit platform this is 16 bytes (and on a 32-bit platform, 8 bytes). The real equivalent in C (ignoring null termination vs. stored length) would be storing into a const char * instead of a char[]; changing the C accordingly prints:
0x7fff21254508 17
0x7fff21254500 17
0x7fff212544f8 17
i.e. the pointers are 8 bytes apart.
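You can also confirm the fat-pointer size directly in Rust with std::mem::size_of; a minimal check:

use std::mem::size_of;

fn main() {
    // &str is a fat pointer: (data pointer, length).
    // On a 64-bit target this prints 16; on a 32-bit target, 8.
    println!("{}", size_of::<&str>());
    // A thin raw pointer is half the size.
    println!("{}", size_of::<*const u8>());
}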
You can check these low-level details using --emit=asm or --emit=llvm-ir, or by clicking the corresponding button on the playpen (possibly adjusting the optimisation level too). E.g.
fn main() {
    let a = "0123456789abcdef0";
}
compiled with --emit=llvm-ir and no optimisations gives (with my trimming and annotations):
%str_slice = type { i8*, i64 }

;; global constant with the string's text
@str1042 = internal constant [17 x i8] c"0123456789abcdef0"

; Function Attrs: uwtable
define internal void @_ZN4main20h55efe3c71b4bb8f4eaaE() unnamed_addr #0 {
entry-block:
  ;; create stack space for the `a` variable
  %a = alloca %str_slice

  ;; get a pointer to the first element of the `a` struct (`data`)...
  %0 = getelementptr inbounds %str_slice* %a, i32 0, i32 0
  ;; ... and store the pointer to the string data in it
  store i8* getelementptr inbounds ([17 x i8]* @str1042, i32 0, i32 0), i8** %0

  ;; get a pointer to the second element of the `a` struct (`length`)...
  %1 = getelementptr inbounds %str_slice* %a, i32 0, i32 1
  ;; ... and store the length of the string (17) in it.
  store i64 17, i64* %1

  ret void
}


Get byte offset after first char of str in Rust

In Rust, I want to get the byte offset immediately after the first character of a str.
fn main() {
    let s: &str = "⚀⚁";
    // char is 4 bytes, right??? (not always when in a str)
    let offset: usize = 4;
    let s1: &str = &s[offset..];
    eprintln!("s1 {:?}", s1);
}
The program expectedly crashes with:
thread 'main' panicked at 'byte index 4 is not a char boundary; it is inside '⚁' (bytes 3..6) of `⚀⚁`'
How can I find the byte offset of the second char '⚁'?
Bonus if this can be done safely and without std.
Related:
How to get the byte offset between &str
How to find the starting offset of a string slice of another string?
A char is a 32-bit integer (a Unicode scalar value), but individual characters inside a str are encoded as variable-width UTF-8, as small as a single 8-bit byte.
You can iterate through the characters of the str and their boundaries using str::char_indices, and your code would look like this:
fn main() {
    let s: &str = "⚀⚁";
    let (offset, _) = s.char_indices().nth(1).unwrap();
    dbg!(offset); // 3

    let s1: &str = &s[offset..];
    eprintln!("s1 {:?}", s1); // s1 "⚁"
}
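For the bonus: char_indices and string slicing are defined in core, not just std, so the same approach works in a #![no_std] crate, with no unsafe needed. A minimal sketch (the helper name is mine):

// Works without std: char_indices and slicing come from core.
fn after_first_char(s: &str) -> &str {
    match s.char_indices().nth(1) {
        // the offset of the second char equals the byte length of the first
        Some((offset, _)) => &s[offset..],
        // zero or one chars: nothing after the first one
        None => "",
    }
}

fn main() {
    assert_eq!(after_first_char("⚀⚁"), "⚁");
    assert_eq!(after_first_char("x"), "");
}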

Why does Rust store i64s captured by a closure as i64*s in the LLVM IR closure environment?

In this simple example
#[inline(never)]
fn apply<F, A, B>(f: F, x: A) -> B
where
    F: FnOnce(A) -> B,
{
    f(x)
}

fn main() {
    let y: i64 = 1;
    let z: i64 = 2;
    let f = |x: i64| x + y + z;
    print!("{}", apply(f, 42));
}
the closure passed to apply is passed as an LLVM IR {i64*, i64*}*:
%closure = type { i64*, i64* }

define internal fastcc i64 @apply(%closure* noalias nocapture readonly dereferenceable(16)) unnamed_addr #0 personality i32 (i32, i32, i64, %"8.unwind::libunwind::_Unwind_Exception"*, %"8.unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {
entry-block:
  %1 = getelementptr inbounds %closure, %closure* %0, i64 0, i32 1
  %2 = getelementptr inbounds %closure, %closure* %0, i64 0, i32 0
  %3 = load i64*, i64** %2, align 8
  %4 = load i64*, i64** %1, align 8
  %.idx.val.val.i = load i64, i64* %3, align 8, !noalias !1
  %.idx1.val.val.i = load i64, i64* %4, align 8, !noalias !1
  %5 = add i64 %.idx.val.val.i, 42
  %6 = add i64 %5, %.idx1.val.val.i
  ret i64 %6
}
(apply actually has a more complicated name in the generated LLVM code.)
This requires two loads to reach each captured variable. Why isn't %closure just {i64, i64} (which would make the argument to apply a {i64, i64}*)?
Closures capture by reference by default. You can change that behavior to capture by value by adding the move keyword before the parameter list:
let f = move |x: i64| x + y + z;
This generates much leaner code:
define internal fastcc i64 @apply(i64 %.0.0.val, i64 %.0.1.val) unnamed_addr #0 personality i32 (i32, i32, i64, %"8.unwind::libunwind::_Unwind_Exception"*, %"8.unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {
entry-block:
  %0 = add i64 %.0.0.val, 42
  %1 = add i64 %0, %.0.1.val
  ret i64 %1
}
Adding the move keyword means that any value that the closure uses will be moved into the closure's environment. In the case of integers, which are Copy, it doesn't make much difference, but in the case of other types like String, it means that you can't use the String anymore in the outer scope after creating the closure. It's an all-or-nothing deal, but you can manually take references to individual variables outside a move closure and have the closure use these references instead of the original values to get manual capture-by-reference behavior.
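A minimal sketch of that manual capture-by-reference pattern (the names are illustrative):

fn main() {
    let s = String::from("hello");
    let s_ref = &s; // take the reference yourself...
    let f = move |x: &str| {
        // ...so the closure moves in the reference, not the String
        format!("{} {}", s_ref, x)
    };
    println!("{}", f("world"));
    println!("{}", s); // `s` itself was never moved
}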
Can you observe the value vs ref difference somehow in this code?
If you take the address of the captured variable, you can observe the difference; in the sketch below, notice how the first and second output lines are the same, and the third is different.
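The original snippet isn't shown here, so this is a reconstructed sketch of that observation (reusing the y: i64 from above):

fn main() {
    let y: i64 = 1;
    println!("{:p}", &y);                      // 1: address of the original

    let by_ref = || println!("{:p}", &y);      // captures `y` by reference
    by_ref();                                  // 2: same address as line 1

    let by_val = move || println!("{:p}", &y); // captures a copy of `y`
    by_val();                                  // 3: a different address
}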

GMP - mpf_cmp_si not working correctly for negative values

I'm using the GNU multiple precision library through Rust, and I'm trying to write a wrapper for the mpf_sqrt() function.
In order to do so, I need to make sure the number is positive, but mpf_cmp_si() isn't behaving.
EDIT: new example
extern crate libc;

use libc::{c_char, c_double, c_int, c_long, c_ulong, c_void};
use std::mem::uninitialized;

type mp_limb_t = usize; // TODO: Find a way to use __gmp_bits_per_limb instead.
type mp_bitcnt_t = c_ulong;
type mp_exp_t = c_long;

#[link(name = "gmp")]
extern "C" {
    fn __gmpf_init2(x: mpf_ptr, prec: mp_bitcnt_t);
    fn __gmpf_set_si(rop: mpf_ptr, op: c_int);
    fn __gmpf_cmp_si(op1: mpf_srcptr, op2: c_long) -> c_int;
}

#[repr(C)]
pub struct mpf_struct {
    _mp_prec: c_int,
    _mp_size: c_int,
    _mp_exp: mp_exp_t,
    _mp_d: *mut c_void,
}

pub type mpf_srcptr = *const mpf_struct;
pub type mpf_ptr = *mut mpf_struct;

fn main() {
    let mut ten: mpf_struct;
    unsafe {
        ten = uninitialized();
        __gmpf_init2(&mut ten, 512);
        __gmpf_set_si(&mut ten, 10);
    }

    let mut minus_ten: mpf_struct;
    unsafe {
        minus_ten = uninitialized();
        __gmpf_init2(&mut minus_ten, 512);
        __gmpf_set_si(&mut minus_ten, -10);
    }

    // compare things
    unsafe {
        println!("Result of comparison of -10 (mpf) and 10 (signed int) = {}",
                 __gmpf_cmp_si(&minus_ten, 10));
        println!("Result of comparison of -10 (mpf) and 0 (signed int) = {}",
                 __gmpf_cmp_si(&minus_ten, 0));
        println!("Result of comparison of 10 (mpf) and 0 (signed int) = {}",
                 __gmpf_cmp_si(&ten, 0));
    }
}
This returns:
Running `target/debug/so_test`
Result of comparison of -10 (mpf) and 10 (signed int) = 1
Result of comparison of -10 (mpf) and 0 (signed int) = 1
Result of comparison of 10 (mpf) and 0 (signed int) = 1
According to the docs, this is the behavior:
Function: int mpf_cmp_si (const mpf_t op1, signed long int op2)
Compare op1 and op2. Return a positive value if op1 > op2, zero if op1 = op2, and a negative value if op1 < op2.
I'm running Rust 1.4.0 and GMP 6.1.0-1 on x64 Linux.
Old code:
pub fn sqrt(self) -> Mpf {
    let mut retval: Mpf;
    unsafe {
        retval = Mpf::new(__gmpf_get_prec(&self.mpf) as usize);
        retval.set_from_si(0);
        if __gmpf_cmp_ui(&self.mpf, 0) > 0 {
            __gmpf_sqrt(&mut retval.mpf, &self.mpf);
        } else {
            panic!("Square root of negative/zero");
        }
    }
    retval
}
the mpf struct is defined like this:
#[repr(C)]
pub struct mpf_struct {
    _mp_prec: c_int,
    _mp_size: c_int,
    _mp_exp: mp_exp_t,
    _mp_d: *mut c_void,
}

and the function from gmp is imported like this:

#[link(name = "gmp")]
extern "C" {
    fn __gmpf_cmp_si(op1: mpf_srcptr, op2: c_long) -> c_int;
}
The problem I'm having is that mpf_cmp_si (which is exposed to Rust as __gmpf_cmp_si) doesn't return a negative value when it should. It should return negative when the value of my mpf is less than 0, but it doesn't, so the function goes on to divide by zero and crashes (an "unknown error", not the panic!() call).
The signature of __gmpf_set_si is incorrect. The C definition is:
void mpf_set_si (mpf_t rop, signed long int op)
And hence the Rust FFI declaration should use c_long, not c_int:
fn __gmpf_set_si(rop: mpf_ptr, op: c_long);
With this change, the output (plus a few extra prints) is as follows:
Result of comparison of -10 (mpf) and -10 (signed int) = 0
Result of comparison of -10 (mpf) and 0 (signed int) = -1
Result of comparison of -10 (mpf) and 10 (signed int) = -1
Result of comparison of 10 (mpf) and -10 (signed int) = 1
Result of comparison of 10 (mpf) and 0 (signed int) = 1
Result of comparison of 10 (mpf) and 10 (signed int) = 0
(NB. adding the comparisons against -10/10 was how I got to the bottom of this: they failed for -10 compared to -10.)
The problem is that int and long aren't necessarily the same: on 64-bit platforms they are typically 32 and 64 bits respectively. In either case the argument is passed in the same (64-bit) register, but the type mismatch means the register is only initialised with a 32-bit -10, which is very different from a 64-bit one. The bit patterns of each are:
0000000000000000000000000000000011111111111111111111111111110110
1111111111111111111111111111111111111111111111111111111111110110
When interpreted as a signed 64-bit integer (which is how mpf_set_si reads its signed long argument), the first one is 2^32 - 10 = 4294967286, which is exactly what minus_ten is initialised with:
extern crate libc;

use libc::{c_char, c_double, c_int, c_long, c_ulong, c_void};
use std::mem::uninitialized;

type mp_limb_t = usize; // TODO: Find a way to use __gmp_bits_per_limb instead.
type mp_bitcnt_t = c_ulong;
type mp_exp_t = c_long;

#[link(name = "gmp")]
extern "C" {
    fn __gmpf_init2(x: mpf_ptr, prec: mp_bitcnt_t);
    fn __gmpf_set_si(rop: mpf_ptr, op: c_int);
    fn __gmp_printf(x: *const c_char, ...);
}

#[repr(C)]
pub struct mpf_struct {
    _mp_prec: c_int,
    _mp_size: c_int,
    _mp_exp: mp_exp_t,
    _mp_d: *mut c_void,
}

pub type mpf_ptr = *mut mpf_struct;

fn main() {
    unsafe {
        let mut ten = uninitialized();
        __gmpf_init2(&mut ten, 512);
        __gmpf_set_si(&mut ten, 10);

        let mut minus_ten = uninitialized();
        __gmpf_init2(&mut minus_ten, 512);
        __gmpf_set_si(&mut minus_ten, -10);

        __gmp_printf(b"10 == %Ff\n-10 == %Ff\n".as_ptr() as *const c_char,
                     &ten, &minus_ten);
    }
}
Output:
10 == 10.000000
-10 == 4294967286.000000
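The truncation can be reproduced in plain Rust without any FFI; a small sketch (assuming, as in the bit patterns above, that the upper half of the register happens to be zero):

fn main() {
    // Low 32 bits hold -10, the upper 32 bits are zero, and the whole
    // register is then read back as a signed 64-bit value:
    let truncated = (-10i32) as u32 as i64;
    println!("{}", truncated); // 4294967286, i.e. 2^32 - 10

    // A properly sign-extended 64-bit -10 for comparison:
    println!("{}", -10i64); // -10
}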
As a final point, rust-bindgen is a great tool for avoiding errors in mechanical transcriptions like this: it will generate the right things based on a C header.

What happens when a stack-allocated value is boxed?

If we have a value that is already allocated on the stack, will boxing copy it to the heap and then transfer ownership (that's how it works in .NET, except that both copies stay alive)? Or will the compiler be "smart" enough to allocate it directly on the heap from the beginning?
struct Foo {
    x: i32,
}

fn main() {
    // a is allocated on the stack?
    let a = Foo { x: 1 };
    // if a is not used, it will be optimized out
    println!("{}", a.x);
    // what happens here? will the stack-allocated structure
    // be moved to the heap? or was it originally allocated on the heap?
    let b = Box::new(a);
}
I'm not a specialist in assembly, but this looks like it is actually allocated on the stack and then moved: http://pastebin.com/8PzsgTJ1. But I need confirmation from someone who actually knows what is happening.
It would be pretty strange for this optimization to happen as you describe it. For example, in this code:
let a = Foo { x: 1 };
// operation that observes a
let b = Box::new(a);
// operation that observes b
&a and &b would be equal, which would be surprising. However, if you do something similar, but don't observe a:
#[inline(never)]
fn frobnotz() -> Box<Foo> {
    let a = Foo { x: 1 };
    Box::new(a)
}
You can see via the LLVM IR that this case was optimized:
define internal fastcc noalias dereferenceable(4) %Foo* @_ZN8frobnotz20h3dca7bc0ee8400bciaaE() unnamed_addr #0 {
entry-block:
  %0 = tail call i8* @je_mallocx(i64 4, i32 0)
  %1 = icmp eq i8* %0, null
  br i1 %1, label %then-block-106-.i.i, label %"_ZN5boxed12Box$LT$T$GT$3new20h2665038481379993400E.exit"

then-block-106-.i.i: ; preds = %entry-block
  tail call void @_ZN3oom20he7076b57c17ed7c6HYaE()
  unreachable

"_ZN5boxed12Box$LT$T$GT$3new20h2665038481379993400E.exit": ; preds = %entry-block
  %2 = bitcast i8* %0 to %Foo*
  %x.sroa.0.0..sroa_idx.i = bitcast i8* %0 to i32*
  store i32 1, i32* %x.sroa.0.0..sroa_idx.i, align 4
  ret %Foo* %2
}
Similarly, you can return the struct on the stack and then box it up, and there will still just be the one allocation:
You may think that this gives us terrible performance: return a value and then immediately box it up?! Isn't this pattern the worst of both worlds? Rust is smarter than that. There is no copy in this code. main allocates enough room for the box, passes a pointer to that memory into foo as x, and then foo writes the value straight into the Box.
As the official Rust documentation explains, Box<T>::new(x: T) allocates memory on the heap and then moves the argument into that memory. Accessing a after let b = Box::new(a) is a compile-time error.
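Both halves of that claim can be seen in a few lines; a small sketch (the assertion is mine, relying on stack and heap addresses being distinct):

struct Foo {
    x: i32,
}

fn main() {
    let a = Foo { x: 1 };
    let stack_addr = &a as *const Foo;

    let b = Box::new(a); // `a` is moved into the fresh heap allocation
    let heap_addr = &*b as *const Foo;

    assert_ne!(stack_addr, heap_addr); // the bytes were copied to the heap
    // println!("{}", a.x); // error[E0382]: borrow of moved value: `a`
}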

How can I implement a string data type in LLVM?

I have been looking at LLVM lately, and I find it to be quite an interesting architecture. However, looking through the tutorial and the reference material, I can't see any examples of how I might implement a string data type.
There is a lot of documentation about integers, reals, and other number types, and even arrays, functions and structures, but AFAIK nothing about strings. Would I have to add a new data type to the backend? Is there a way to use built-in data types? Any insight would be appreciated.
What is a string? An array of characters.
What is a character? An integer.
So while I'm no LLVM expert by any means, I would guess that if, e.g., you wanted to represent some 8-bit character set, you'd use an array of i8 (8-bit integers), or a pointer to i8. And indeed, if we have a simple hello-world C program:
#include <stdio.h>

int main() {
    puts("Hello, world!");
    return 0;
}
And we compile it using llvm-gcc and dump the generated LLVM assembly:
$ llvm-gcc -S -emit-llvm hello.c
$ cat hello.s
; ModuleID = 'hello.c'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128"
target triple = "x86_64-linux-gnu"

@.str = internal constant [14 x i8] c"Hello, world!\00" ; <[14 x i8]*> [#uses=1]

define i32 @main() {
entry:
  %retval = alloca i32 ; <i32*> [#uses=2]
  %tmp = alloca i32 ; <i32*> [#uses=2]
  %"alloca point" = bitcast i32 0 to i32 ; <i32> [#uses=0]
  %tmp1 = getelementptr [14 x i8]* @.str, i32 0, i64 0 ; <i8*> [#uses=1]
  %tmp2 = call i32 @puts( i8* %tmp1 ) nounwind ; <i32> [#uses=0]
  store i32 0, i32* %tmp, align 4
  %tmp3 = load i32* %tmp, align 4 ; <i32> [#uses=1]
  store i32 %tmp3, i32* %retval, align 4
  br label %return

return: ; preds = %entry
  %retval4 = load i32* %retval ; <i32> [#uses=1]
  ret i32 %retval4
}

declare i32 @puts(i8*)
Notice the reference to the puts function declared at the end of the file. In C, puts is
int puts(const char *s)
In LLVM, it is
i32 @puts(i8*)
The correspondence should be clear.
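Since the rest of this page is Rust-flavoured, the same correspondence is visible through Rust's FFI, where C's const char* (LLVM's i8*) becomes *const c_char; a sketch:

use std::os::raw::{c_char, c_int};

extern "C" {
    // int puts(const char *s); i.e. i32 @puts(i8*) in LLVM terms
    fn puts(s: *const c_char) -> c_int;
}

fn main() {
    // A NUL-terminated byte string, like LLVM's c"Hello, world!\00"
    let msg = b"Hello, world!\0";
    unsafe {
        puts(msg.as_ptr() as *const c_char);
    }
}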
As an aside, the generated LLVM is very verbose here because I compiled without optimizations. If you turn those on, the unnecessary instructions disappear:
$ llvm-gcc -O2 -S -emit-llvm hello.c
$ cat hello.s
; ModuleID = 'hello.c'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128"
target triple = "x86_64-linux-gnu"

@.str = internal constant [14 x i8] c"Hello, world!\00" ; <[14 x i8]*> [#uses=1]

define i32 @main() nounwind {
entry:
  %tmp2 = tail call i32 @puts( i8* getelementptr ([14 x i8]* @.str, i32 0, i64 0) ) nounwind ; <i32> [#uses=0]
  ret i32 0
}

declare i32 @puts(i8*)
[To follow up on other answers which explain what strings are, here is some implementation help]
Using the C interface, the calls you'll want are something like:
LLVMValueRef llvmGenLocalStringVar(const char* data, int len)
{
    LLVMValueRef glob = LLVMAddGlobal(mod, LLVMArrayType(LLVMInt8Type(), len), "string");

    // set as internal linkage and constant
    LLVMSetLinkage(glob, LLVMInternalLinkage);
    LLVMSetGlobalConstant(glob, TRUE);

    // Initialize with string:
    LLVMSetInitializer(glob, LLVMConstString(data, len, TRUE));
    return glob;
}
Using the C API, instead of using LLVMConstString, you could use LLVMBuildGlobalString. Here is my implementation of
int main() {
    printf("Hello World, %s!\n", "there");
    return 0;
}
using the C API:
LLVMTypeRef main_type = LLVMFunctionType(LLVMVoidType(), NULL, 0, false);
LLVMValueRef main = LLVMAddFunction(mod, "main", main_type);

LLVMTypeRef param_types[] = { LLVMPointerType(LLVMInt8Type(), 0) };
LLVMTypeRef llvm_printf_type = LLVMFunctionType(LLVMInt32Type(), param_types, 0, true);
LLVMValueRef llvm_printf = LLVMAddFunction(mod, "printf", llvm_printf_type);

LLVMBasicBlockRef entry = LLVMAppendBasicBlock(main, "entry");
LLVMPositionBuilderAtEnd(builder, entry);

LLVMValueRef format = LLVMBuildGlobalStringPtr(builder, "Hello World, %s!\n", "format");
LLVMValueRef value = LLVMBuildGlobalStringPtr(builder, "there", "value");

LLVMValueRef args[] = { format, value };
LLVMBuildCall(builder, llvm_printf, args, 2, "printf");
LLVMBuildRetVoid(builder);
I created strings like so:
LLVMValueRef format = LLVMBuildGlobalStringPtr(builder, "Hello World, %s!\n", "format");
LLVMValueRef value = LLVMBuildGlobalStringPtr(builder, "there", "value");
The generated IR is:
; ModuleID = 'printf.bc'
source_filename = "my_module"
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"

@format = private unnamed_addr constant [18 x i8] c"Hello World, %s!\0A\00"
@value = private unnamed_addr constant [6 x i8] c"there\00"

define void @main() {
entry:
  %printf = call i32 (...) @printf(i8* getelementptr inbounds ([18 x i8], [18 x i8]* @format, i32 0, i32 0), i8* getelementptr inbounds ([6 x i8], [6 x i8]* @value, i32 0, i32 0))
  ret void
}

declare i32 @printf(...)
For those using the C++ API of LLVM, you can rely on IRBuilder's CreateGlobalStringPtr:
Builder.CreateGlobalStringPtr(StringRef("Hello, world!"));
This will be represented as i8* in the final LLVM IR.
Think about how a string is represented in common languages:
C: a pointer to a character. You don't have to do anything special.
C++: string is a complex object with a constructor, destructor, and copy constructor. On the inside, it usually holds essentially a C string.
Java/C#/...: a string is a complex object holding an array of characters.
LLVM's name is very self-explanatory. It really is "low level". You have to implement strings however you want them to be. It would be silly for LLVM to force anyone into a specific implementation.
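To make that concrete in the language used elsewhere on this page, here is a sketch (the type names are mine) of the two layouts discussed above: a C string is a bare pointer whose length is implicit in the trailing NUL, while a Rust-style slice carries an explicit byte count:

use std::mem::size_of;
use std::os::raw::c_char;

// C-style: just a pointer; length is implicit (scan for the trailing \0).
#[repr(C)]
struct CStringRef {
    data: *const c_char,
}

// Rust-style fat pointer, like the &str layout shown earlier on this page.
#[repr(C)]
struct StrSlice {
    data: *const u8, // the bytes of the text
    length: usize,   // explicit byte count; no NUL terminator needed
}

fn main() {
    // The fat pointer costs one extra word over the bare pointer.
    assert_eq!(size_of::<StrSlice>(), 2 * size_of::<CStringRef>());
}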
