Why are string constant pointers different across crates in Rust? - rust

While working with a HashMap that uses &'static str as the key type, I created a newtype to hash by the pointer rather than by the string contents to reduce overhead.
pub struct StaticStr(&'static str);
impl Hash for StaticStr {
fn hash<H: Hasher>(&self, state: &mut H) {
self.0.as_ptr().hash(state)
}
}
impl PartialEq for StaticStr {
fn eq(&self, other: &Self) -> bool {
self.0.as_ptr() == other.0.as_ptr()
}
}
impl Eq for StaticStr {}
It turns out that this does not work consistently, as in the following example.
pub type MyMap = HashMap<StaticStr, u8>;
pub const A: &str = "A";
pub fn make_map() -> MyMap {
let mut map = MyMap::new();
map.insert(StaticStr(A), 1);
map
}
pub fn get_value(control: &MyMap) -> Option<u8> {
control.get(&StaticStr(A)).cloned()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
pub fn map_made_in_lib() {
let map = make_map();
assert_eq!(get_value(&map), Some(1));
}
#[test]
pub fn map_made_in_test() {
// Same as make_map()
let mut map = MyMap::new();
map.insert(StaticStr(A), 1);
// This check fails
assert_eq!(get_value(&map), Some(1));
}
}
Notice that in the first test, the string constant A is only used directly in the lib crate. In the second test, A is used directly in both the lib crate and the test crate. I discovered that although both tests use the same string constant, the pointers are different depending on which crate refers to the string constant by name. This is demonstrated in the minimal reproduction I created. I would have expected that the string literal be included only once for the crate that defines it, or at least that the linker would be smart enough to deduplicate the string literals. Is there a reason for this behavior?

Instead of a const try a static?
A constant item is an optionally named constant value which is not
associated with a specific memory location in the program. Constants
are essentially inlined wherever they are used, meaning that they are
copied directly into the relevant context when used. This includes
usage of constants from external crates, and non-Copy types.
References to the same constant are not necessarily guaranteed to
refer to the same memory address. -- The Rust Reference
A static item is similar to a constant, except that it represents a
precise memory location in the program. All references to the static
refer to the same memory location. Static items have the static
lifetime, which outlives all other lifetimes in a Rust program. Static
items do not call drop at the end of the program. -- The Rust Reference

Related

How to share parts of a string with Rc?

I want to create some references to a str with Rc, without cloning str:
fn main() {
let s = Rc::<str>::from("foo");
let t = Rc::clone(&s); // Creating a new pointer to the same address is easy
let u = Rc::clone(&s[1..2]); // But how can I create a new pointer to a part of `s`?
let w = Rc::<str>::from(&s[0..2]); // This seems to clone str
assert_ne!(&w as *const _, &s as *const _);
}
playground
How can I do this?
While it's possible in principle, the standard library's Rc does not support the case you're trying to create: a counted reference to a part of reference-counted memory.
However, we can get the effect for strings using a fairly straightforward wrapper around Rc which remembers the substring range:
use std::ops::{Deref, Range};
use std::rc::Rc;
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub struct RcSubstr {
string: Rc<str>,
span: Range<usize>,
}
impl RcSubstr {
fn new(string: Rc<str>) -> Self {
let span = 0..string.len();
Self { string, span }
}
fn substr(&self, span: Range<usize>) -> Self {
// A full implementation would also have bounds checks to ensure
// the requested range is not larger than the current substring
Self {
string: Rc::clone(&self.string),
span: (self.span.start + span.start)..(self.span.start + span.end)
}
}
}
impl Deref for RcSubstr {
type Target = str;
fn deref(&self) -> &str {
&self.string[self.span.clone()]
}
}
fn main() {
let s = RcSubstr::new(Rc::<str>::from("foo"));
let u = s.substr(1..2);
// We need to deref to print the string rather than the wrapper struct.
// A full implementation would `impl Debug` and `impl Display` to produce
// the expected substring.
println!("{}", &*u);
}
There are a lot of conveniences missing here, such as suitable implementations of Display, Debug, AsRef, Borrow, From, and Into — I've provided only enough code to illustrate how it can work. Once supplemented with the appropriate trait implementations, this should be just as usable as Rc<str> (with the one edge case that it can't be passed to a library type that wants to store Rc<str> in particular).
The crate arcstr claims to offer a finished version of this basic idea, but I haven't used or studied it and so can't guarantee its quality.
The crate owning_ref provides a way to hold references to parts of an Rc or other smart pointer, but there are concerns about its soundness and I don't fully understand which circumstances that applies to (issue search which currently has 3 open issues).

Implicit conversion to static lifetime in anyhow::Error

I was reading up code for anyhow crate in Rust. There is a particular line I don't fully grasp:
{
let vtable = &ErrorVTable { ... };
construct(vtable, ...);
}
fn construct(vtable: &'static ErrorVTable, ...);
We seem to create an ErrorVTable struct, and return a reference to it which has lifetime `static. I'd expect compiler to create a struct on function stack, and return a reference to it, causing weird memory issues.
But it appears like compiler detects that this variable for all possible E inferred on compile time and somehow creates static variables for them? How does this actually work?
Consider this simplified code:
struct Foo {
x: i32
}
fn test(_: &'static Foo) {}
fn main() {
let f = Foo{ x: 42 };
test(&f);
}
As expected, it does not compile, with the message:
f does not live long enough
However this slightly variation does compile:
fn main() {
let f = &Foo{ x: 42 };
test(f);
}
The difference is that in the former, the Foo object is local, with a local lifetime, so no 'static reference can be built. But in the latter, the actual object is a static constant so it has static lifetime, and f is just a reference to it.
To help see the difference, consider this other equivalent code:
const F: Foo = Foo{ x: 42 };
fn main() {
test(&F);
}
Or if you use an actual constant literal:
fn test_2(_: &'static i32) {}
fn main() {
let i = &42;
test_2(&i);
}
Naturally, this only works if all the arguments of the Foo construction are constant. If any value is not constant, then the compiler will silently switch to a local temporary instead of a static constant and you will lose the 'static lifetime.
The precise rules for this constant promotion, as it is sometimes called, are a bit complicated and may be extended in newer compiler versions.

Writing a Rust struct type that contains a string and can be used in a constant

I'm getting started with Rust. I want to have a struct that contains (among other things) a string:
#[derive(Clone, Debug)]
struct Foo {
string_field: &str, // won't compile, but suppose String or Box<str> or &'a str or &'static str...
}
And I want to be able to declare constants or statics of it:
static FOO1: Foo = Foo {
string_field: "",
};
And I also want to be able to have it contain a string constructed at runtime:
let foo2 = Foo {
string_field: ("a".to_owned() + "b").as_str(),
};
I could add a lifetime parameter to Foo so that I can declare that the string reference has the same lifetime. That's fine, except that it then seems to require an explicit lifetime parameter for everything that contains a Foo, which means that it complicates the rest of my program (even parts that don't care about being able to use constant expressions).
I could write
enum StringOfAdequateLifetime {
Static(&'static str),
Dynamic(Box<str>), // or String, if you like
}
struct Foo {
string_field: StringOfAdequateLifetime,
}
and that seems to work so far but clutters up writing out literal Foos.
It seems obvious enough that the desired runtime behavior is sound: when you drop a Foo, drop the string it contains — and if it's static it's never dropped, so no extra information is needed to handle the two cases. Is there a clean way to ask Rust for just that?
(It seems like what I could use is some kind of "smart pointer" type to hold the string that can also be written as a constant expression for the static case, but I haven't seen one in the standard library, and when I tried to genericize StringOfAdequateLifetime to apply to any type, I ran into further complications with implementing and using the various standard traits like Deref, which I suspect were due to something about the differences between Sized and non-Sized types.)
The rust standard library has a built-in type for this exact use case, Cow. It's an enum that can represent either a reference or an owned value, and will clone the value if necessary to allow mutable access. In your particular use case, you could define the struct like so:
struct Foo {
string_field: Cow<'static, str>
}
Then you could instantiate it in one of two ways, depending on whether you want a borrowed constant string or an owned runtime-constructed value:
const BORROWED: Foo = Foo { string_field: Cow::Borrowed("some constant") };
let owned = Foo { string_field: Cow::Owned(String::from("owned string")) };
To simplify this syntax, you can define your own constructor functions for the type using a const fn to allow using the borrowed constructor in a constant context:
impl Foo {
pub const fn new_const(value: &'static str) -> Self {
Self { string_field: Cow::borrowed(value) }
}
pub fn new_runtime(value: String) -> Self {
Self { string_field: Cow::Owned(value) }
}
}
This allows you to use a simpler syntax for initializing the values:
const BORROWED: Foo = Foo::new_const("some constant");
let owned = Foo::new_runtime(String::from("owned string"));

How do I declare a static variable as a reference to a hard-coded memory address?

I am working on embedded Rust code for LPC82X series controllers from NXP - the exact toolchain does not matter for the question.
These controllers contain peripheral drivers in ROM. I want to use these drivers, which means I need to use unsafe Rust and FFI without linking actual code.
The ROM APIs expose function pointers packed into C structs at specific address locations. If somebody wants the details of this API, chapter 29 of the LPC82X manual describes the API in question.
My Rust playground dummy sketch looks like this, that would be hidden from application code, by a yet unwritten I2C abstraction lib. This compiles.
#![feature(naked_functions)]
const I2C_ROM_API_ADDRESS: usize = 0x1fff_200c;
static mut ROM_I2C_API: Option<&RomI2cApi> = None;
#[repr(C)]
struct RomI2cApi {
// Dummy functions, real ones take arguments, and have different return
// These won't be called directly, only through the struct's implemented methods
// value
master_transmit_poll: extern "C" fn() -> bool,
master_receive_poll: extern "C" fn() -> bool,
}
impl RomI2cApi {
fn api_table() -> &'static RomI2cApi {
unsafe {
match ROM_I2C_API {
None => RomI2cApi::new(),
Some(table) => table,
}
}
}
unsafe fn new() -> &'static RomI2cApi {
ROM_I2C_API = Some(&*(I2C_ROM_API_ADDRESS as *const RomI2cApi));
ROM_I2C_API.unwrap()
}
#[inline]
fn master_transmit_poll(&self) -> bool {
(self.master_transmit_poll)()
}
#[inline]
fn master_receive_poll(&self) -> bool {
(self.master_receive_poll)()
}
}
impl From<usize> for &'static RomI2cApi {
fn from(address: usize) -> &'static RomI2cApi {
unsafe { &*(address as *const RomI2cApi) }
}
}
fn main() {
let rom_api = unsafe { RomI2cApi::api_table() };
println!("ROM I2C API address is: {:p}", rom_api);
// Should be commented out when trying !
rom_api.master_transmit_poll();
}
I cannot declare the function pointer structs as non-mutable static as statics have many restrictions, including not dereferencing pointers in the assignment. Is there a better workaround than Option? Using Option with the api_table function at least guarantees that initialization happens.
You can get around having a static at all:
const ROM_I2C_API: &RomI2cApi = &*(0x1fff_200c as *const RomI2cApi);
Not yet working, but is planned to work in the future. For now use
const ROM_I2C_API: *const RomI2cApi = 0x1fff_200c as *const RomI2cApi;
fn api_table() -> &'static RomI2cApi {
unsafe { &*(ROM_I2C_API) }
}
This creates a &'static RomI2cApi and allows you to access the functions everywhere directly by calling api_table().master_transmit_poll()

Use Trait as Vec Type

I'm new to Rust and have seen some examples of people using Box to allow pushing many types that implement a certain Trait onto a Vec. When using a Trait with Generics, I have run into an issue.
error[E0038]: the trait `collision::collision_detection::Collidable` cannot be made into an object
--> src/collision/collision_detection.rs:19:5
|
19 | collidables: Vec<Box<Collidable<P, M>>>,
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `collision::collision_detection::Collidable` cannot be made into an object
|
= note: method `get_ncollide_shape` has generic type parameters
error: aborting due to previous error
error: Could not compile `game_proto`.
To learn more, run the command again with --verbose.
Here is my code
extern crate ncollide;
extern crate nalgebra as na;
use self::ncollide::shape::Shape;
use self::ncollide::math::Point;
use self::ncollide::math::Isometry;
use self::na::Isometry2;
pub trait Collidable<P: Point, M> {
fn get_ncollide_shape<T: Shape<P, M>>(&self) -> Box<T>;
fn get_isometry(&self) -> Isometry2<f64>;
}
pub struct CollisionRegistry<P, M>
where
P: Point,
M: Isometry<P>,
{
collidables: Vec<Box<Collidable<P, M>>>,
}
impl<P: Point, M: Isometry<P>> CollisionRegistry<P, M> {
pub fn new() -> Self {
let objs: Vec<Box<Collidable<P, M>>> = Vec::new();
CollisionRegistry { collidables: objs }
}
pub fn register<D>(&mut self, obj: Box<D>)
where
D: Collidable<P, M>,
{
self.collidables.push(obj);
}
}
I'm trying to use collidables as a list of heterogenous game objects that will give me ncollide compatible Shapes back to feed into the collision detection engine.
EDIT:
To clear up some confusion. I'm not trying to construct and return an instance of a Trait. I'm just trying to create a Vec that will allow any instance of the Collidable trait to be pushed onto it.
Rust is a compiled language, so when it compiles your code, it needs to know all of the information it might need to generate machine code.
When you say
trait MyTrait {
fn do_thing() -> Box<u32>;
}
struct Foo {
field: Box<MyTrait>
}
you are telling Rust that Foo will contain a box containing anything implementing MyTrait. By boxing the type, the compiler will erase any additional data about the data type that isn't covered by the trait. These trait objects are implemented as a set of data fields and a table of functions (called a vtable) that contains the functions exposed by the trait, so they can be called.
When you change
fn do_thing() -> Box<u32>;
to
fn do_thing<T>() -> Box<T>;
it may look similar, but the behavior is much different. Let's take a normal function example
fn do_thing<T>(val: T) { }
fn main() {
do_thing(true);
do_thing(45 as u32);
}
the compiler performs what is a called monomorphization, which means your code in the compiler becomes essentially
fn do_thing_bool(val: bool) { }
fn do_thing_num(val: u32) { }
fn main() {
do_thing_bool(true);
do_thing_num(45 as u32);
}
The key thing to realize is that you are asking it to do the same thing for your trait. The problem is that the compiler can't do it. The example above relies on knowing ahead of time that do_thing is called with a number in one case and a boolean in another, and it can know with 100% certainty that those are the only two ways the function is used.
With your code
trait MyTrait {
fn do_thing<T>() -> Box<T>;
}
the compiler does not know what types do_thing will be called with, so it has no way to generate functions you'd need to call. To do that, wherever you convert the struct implementing Collidable into a boxed object it would have to know every possible return type get_ncollide_shape could have, and that is not supported.
Other links for this:
Understanding Traits and Object Safety
https://www.reddit.com/r/rust/comments/3an132/how_to_wrap_a_trait_object_that_has_generic/

Resources