Why do curly braces for struct instantiation not define a scope? - rust

I have the following code I am surprised works:
struct S<'a> {
value: &'a String,
}
fn foo(s: &S) {
println!("{}", s.value);
}
#[allow(dead_code)]
fn main() {
let s = S {
value: &String::from("ABC"),
};
foo(&s);
}
If I see a pair of curly braces, I imagine them as a scope. So for me, the line S { value: &String::from("ABC") }; and, more importantly, the part between the curly braces represents a scope. Inside this scope, an anonymous string is created and a reference to it is taken. After the brace is closed, the string should be destroyed and the next line foo(&s) should tell me something about lifetimes, but this is not the case! Why?

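In Rust, the curly braces of a struct expression do not denote a scope; they simply enclose the fields of the struct. It's like how you can use curly braces when invoking a macro, but those curly braces do not create a scope. If they did create a scope, it would be incredibly inconvenient, as you couldn't write something like &String::from("ABC"). The temporary String also outlives the let statement itself: a temporary that is borrowed directly in a let initializer, including inside a struct field, has its lifetime extended to the end of the enclosing block.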
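To make the difference visible, here is a minimal sketch that records drop order. The `Noisy` and `Holder` types and the `drop_order` function are hypothetical names introduced only for this illustration. A temporary borrowed inside struct-literal braces should survive until the end of the enclosing block, while a local declared inside a block expression is dropped as soon as that block ends.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared event log so the drop order can be inspected afterwards.
type Log = Rc<RefCell<Vec<String>>>;

// Hypothetical helper type that records when it is dropped.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

struct Holder<'a> {
    value: &'a Noisy,
}

fn drop_order() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        // Struct-literal braces: the borrowed temporary is kept alive
        // until the end of the enclosing block.
        let h = Holder {
            value: &Noisy { name: "struct-literal temporary", log: log.clone() },
        };
        log.borrow_mut().push(format!("use {}", h.value.name));

        // Block expression: a real scope, so `n` is dropped at its closing brace.
        let len = {
            let n = Noisy { name: "block local", log: log.clone() };
            n.name.len()
        };
        log.borrow_mut().push(format!("block returned {}", len));
    }
    // Bind first so the `Ref` guard is released before `log` goes away.
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in drop_order() {
        println!("{}", event);
    }
}
```

The log shows "drop block local" before "block returned 11", and "drop struct-literal temporary" only at the very end.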

Related

Do structs inside Box::new need curly braces

Do empty structures inside Box::new() need curly braces? If not, is there a preferred style?
struct Empty;
// are these equivalent?
fn get_empty_box() -> Box<Empty> {
Box::new(Empty)
}
fn get_empty_box_alt() -> Box<Empty> {
Box::new(Empty {})
}
Depends on how they are defined.
If they are defined as struct Foo {} then yes.
If they are defined as struct Foo(); then they need parentheses or braces, but it's uncommon to see them with braces.
If they're defined as struct Foo; then they don't need braces (although they accept them) and are usually instantiated without braces.
The technical reason for that is that struct Foo; defines, in addition to the struct itself, a constant with the same name that contains an instance of the struct. That is, const Foo: Foo = Foo {};. When you spell Foo without braces you just copy this constant.
In a similar fashion, tuple structs (struct Foo();) define, in addition to the struct itself, a function that instantiates it: fn Foo() -> Foo { Foo {} }.
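A short sketch of the three definition styles may help. The type names (`Unit`, `Tuple`, `Braced`, `Meters`) and the `wrap_all` helper are made up for this example. It shows the unit-struct name used as a value, and a tuple-struct name used as an ordinary function.

```rust
#[derive(Debug, PartialEq)]
struct Unit;

#[derive(Debug, PartialEq)]
struct Tuple();

#[derive(Debug, PartialEq)]
struct Braced {}

#[derive(Debug, PartialEq)]
struct Meters(u32);

// `Meters` names the constructor function here, so it can be passed to `map`.
fn wrap_all(values: Vec<u32>) -> Vec<Meters> {
    values.into_iter().map(Meters).collect()
}

fn main() {
    // A unit struct can be written with or without braces.
    assert_eq!(Unit, Unit {});

    // An empty tuple struct accepts parentheses or braces.
    assert_eq!(Tuple(), Tuple {});

    // A struct declared with braces always needs them.
    let braced = Braced {};
    println!("{:?}", braced);

    // The tuple-struct name coerces to a function pointer.
    let make: fn() -> Tuple = Tuple;
    assert_eq!(make(), Tuple());

    println!("{:?}", wrap_all(vec![1, 2]));
}
```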

Why do I need angle brackets in <$a> when implementing macro based on type?

I can implement a macro taking a type like this:
trait Boundable<A> {
fn max_value() -> A;
}
impl Boundable<u8> for u8 {
fn max_value() -> u8 { u8::MAX }
}
When I turn the impl into a macro, why do I need to surround the type itself with angle brackets, as in this?
macro_rules! impl_boundable {
($a:ty) => {
impl Boundable<$a> for $a {
fn max_value() -> $a { <$a>::MAX }
}
};
}
impl_boundable!(i8);
In particular, <$a>::MAX. Without it, the compiler gives me error missing angle brackets in associated item path. It puzzles me why the macro code needs to be different from the non-macro code.
The syntax is _path_::item, not _type_::item. Valid paths include identifiers and <T> for types T.
In u8::MAX, the u8 is allowed because it is an identifier, not because it is a type. [u8; 1]::item is not allowed.
If your macro takes $a:ident, instead of $a:ty, it will work as is with types which are also identifiers like u8. But, accepting a type $a:ty, the generic way to make a path from a type is with angle brackets <$a>.
It is also an option for your macro to accept a path directly: $a:path. But you are likely to encounter bug #48067: the parser cannot figure out how to compose a path from smaller path segments. There is a workaround for this case in the ticket of use $a as base; base::MAX.
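As a sketch of the `<$a>` form in practice, the macro below is a variant of the one in the question, extended (as an assumption for illustration) to accept a comma-separated list of types. The last lines of `main` show the same angle-bracket syntax outside a macro, on a type that is not an identifier.

```rust
trait Boundable<A> {
    fn max_value() -> A;
}

macro_rules! impl_boundable {
    // Accept one or more types, with an optional trailing comma.
    ($($a:ty),* $(,)?) => {
        $(
            impl Boundable<$a> for $a {
                // `<$a>` turns the type fragment into a path segment.
                fn max_value() -> $a { <$a>::MAX }
            }
        )*
    };
}

impl_boundable!(i8, u16);

fn main() {
    // Fully qualified calls, to avoid picking the inherent `max_value`
    // that the integer types also define.
    println!("{}", <i8 as Boundable<i8>>::max_value());
    println!("{}", <u16 as Boundable<u16>>::max_value());

    // `[u8]` is a type but not an identifier, so it needs angle brackets
    // before `::` even in ordinary code.
    let n = <[u8]>::len(&[1, 2, 3]);
    println!("{}", n);
}
```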

Writing a Rust struct type that contains a string and can be used in a constant

I'm getting started with Rust. I want to have a struct that contains (among other things) a string:
#[derive(Clone, Debug)]
struct Foo {
string_field: &str, // won't compile, but suppose String or Box<str> or &'a str or &'static str...
}
And I want to be able to declare constants or statics of it:
static FOO1: Foo = Foo {
string_field: "",
};
And I also want to be able to have it contain a string constructed at runtime:
let foo2 = Foo {
string_field: ("a".to_owned() + "b").as_str(),
};
I could add a lifetime parameter to Foo so that I can declare that the string reference has the same lifetime. That's fine, except that it then seems to require an explicit lifetime parameter for everything that contains a Foo, which means that it complicates the rest of my program (even parts that don't care about being able to use constant expressions).
I could write
enum StringOfAdequateLifetime {
Static(&'static str),
Dynamic(Box<str>), // or String, if you like
}
struct Foo {
string_field: StringOfAdequateLifetime,
}
and that seems to work so far but clutters up writing out literal Foos.
It seems obvious enough that the desired runtime behavior is sound: when you drop a Foo, drop the string it contains — and if it's static it's never dropped, so no extra information is needed to handle the two cases. Is there a clean way to ask Rust for just that?
(It seems like what I could use is some kind of "smart pointer" type to hold the string that can also be written as a constant expression for the static case, but I haven't seen one in the standard library, and when I tried to genericize StringOfAdequateLifetime to apply to any type, I ran into further complications with implementing and using the various standard traits like Deref, which I suspect were due to something about the differences between Sized and non-Sized types.)
The Rust standard library has a built-in type for this exact use case, Cow (std::borrow::Cow). It's an enum that can represent either a reference or an owned value, and will clone the value if necessary to allow mutable access. In your particular use case, you could define the struct like so:
struct Foo {
string_field: Cow<'static, str>
}
Then you could instantiate it in one of two ways, depending on whether you want a borrowed constant string or an owned runtime-constructed value:
const BORROWED: Foo = Foo { string_field: Cow::Borrowed("some constant") };
let owned = Foo { string_field: Cow::Owned(String::from("owned string")) };
To simplify this syntax, you can define your own constructor functions for the type using a const fn to allow using the borrowed constructor in a constant context:
impl Foo {
pub const fn new_const(value: &'static str) -> Self {
Self { string_field: Cow::Borrowed(value) }
}
pub fn new_runtime(value: String) -> Self {
Self { string_field: Cow::Owned(value) }
}
}
This allows you to use a simpler syntax for initializing the values:
const BORROWED: Foo = Foo::new_const("some constant");
let owned = Foo::new_runtime(String::from("owned string"));
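Putting the pieces together, a self-contained version might look like the sketch below. The `text` and `is_borrowed` accessors are additions for this illustration and are not part of the answer above.

```rust
use std::borrow::Cow;

#[derive(Clone, Debug)]
struct Foo {
    string_field: Cow<'static, str>,
}

impl Foo {
    // Usable in `const` and `static` initializers.
    pub const fn new_const(value: &'static str) -> Self {
        Self { string_field: Cow::Borrowed(value) }
    }

    // For strings built at runtime; takes ownership of the allocation.
    pub fn new_runtime(value: String) -> Self {
        Self { string_field: Cow::Owned(value) }
    }

    // Illustrative accessor: both variants deref to `str`.
    pub fn text(&self) -> &str {
        &self.string_field
    }

    // Illustrative helper to show which variant is stored.
    pub fn is_borrowed(&self) -> bool {
        matches!(self.string_field, Cow::Borrowed(_))
    }
}

static FOO1: Foo = Foo::new_const("");

fn main() {
    let foo2 = Foo::new_runtime("a".to_owned() + "b");
    println!("{:?} borrowed={}", FOO1, FOO1.is_borrowed());
    println!("{:?} borrowed={}", foo2, foo2.is_borrowed());
    println!("{}", foo2.text());
}
```

Because the lifetime is fixed to 'static, Foo needs no lifetime parameter, so types that contain a Foo do not need one either.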

Is there any different semantics between "(1..4)" and "{1..4}" iteration in Rust?

I find using (1..4)
fn main() {
for v in (1..4) {
println!("{}", v);
}
}
and {1..4}
fn main() {
for v in {1..4} {
println!("{}", v);
}
}
gets the same result. Is there any different semantics between "(1..4)" and "{1..4}" iteration?
They produce the same iterators. You can even omit parentheses/braces:
fn main() {
for v in 1..4 {
println!("{}", v);
}
}
You can enclose an expression with () or {} in general. There is a difference, however: {} creates a block, and you can write statements (like let) in it. There is also a very subtle difference in how expressions are parsed. Edit: I found a blog article that describes another difference in how coercion and borrowck work.
Usually () is preferred if you don't need statements.
There's no real useful difference. Both parentheses and braces count as a single expression and function to alter the precedence. I'm pretty sure they have slightly different parsing rules, but at that point I'd guess there's a cleaner way of writing the code.
Note that in your examples, the idiomatic way would be to use neither:
fn main() {
for v in 1..4 {
println!("{}", v);
}
}
When needed, I feel I've only ever seen parentheses used, never braces:
fn main() {
println!("{}", (1..4).count());
}
There are rare cases where curly braces provide more power. Since they serve to start a new scope, you can use them to "manually" transfer ownership in some tricky locations. For the purposes of the simple iterator described, there won't be any visible difference.
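A small sketch of what braces can do that parentheses cannot is given below. The function names are invented for this example. The first function uses statements inside the block; the second uses `{ v }` to move a value out of a variable.

```rust
// A block may contain statements before its final expression;
// parentheses may only wrap a single expression.
fn range_sum() -> i32 {
    let range = {
        let start = 1;
        let end = 4;
        start..end
    };
    range.sum()
}

// `{ v }` is a block whose value is `v`, so it moves `v` out
// of the variable instead of borrowing it.
#[allow(unused_braces)]
fn moved_len() -> usize {
    let v = vec![1, 2, 3];
    let w = { v };
    // `v` can no longer be used here; `w` now owns the vector.
    w.len()
}

fn main() {
    println!("{}", range_sum());
    println!("{}", moved_len());
}
```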
In addition to the existing answers I was interested what the difference would be in the mid-level IR.
Even though the braces introduce a new block, in this case there is virtually no difference even in the (Nightly) MIR - the compiler immediately recognizes that the block serves no other purpose than returning a Range.

Lifetime of references in closures

I need a closure to refer to parts of an object in its enclosing environment. The object is created within the environment and is scoped to it, but once created it could be safely moved to the closure.
The use case is a function that does some preparatory work and returns a closure that will do the rest of the work. The reason for this design are execution constraints: the first part of the work involves allocation, and the remainder must do no allocation. Here is a minimal example:
fn stage_action() -> Box<dyn Fn()> {
// split a freshly allocated string into pieces
let string = String::from("a:b:c");
let substrings = vec![&string[0..1], &string[2..3], &string[4..5]];
// the returned closure refers to the substrings vector of
// slices without any further allocation or modification
Box::new(move || {
for sub in substrings.iter() {
println!("{}", sub);
}
})
}
fn main() {
let action = stage_action();
// ...executed some time later:
action();
}
This fails to compile, correctly stating that &string[0..1] and others must not outlive string. But if string were moved into the closure, there would be no problem. Is there a way to force that to happen, or another approach that would allow the closure to refer to parts of an object created just outside of it?
I've also tried creating a struct with the same functionality to make the move fully explicit, but that doesn't compile either. Again, compilation fails with the error that &later[0..1] and others only live until the end of function, but "borrowed value must be valid for the static lifetime".
Even completely avoiding a Box doesn't appear to help - the compiler complains that the object doesn't live long enough.
There's nothing specific to closures here; it's the equivalent of:
fn main() {
let string = String::from("a:b:c");
let substrings = vec![&string[0..1], &string[2..3], &string[4..5]];
let string = string;
}
You are attempting to move the String while there are outstanding borrows. In my example here, it's to another variable; in your example it's to the closure's environment. Either way, you are still moving it.
Additionally, you are trying to move the substrings into the same closure environment as the owning string. That makes the entire problem equivalent to "Why can't I store a value and a reference to that value in the same struct?":
struct Environment<'a> {
string: String,
substrings: Vec<&'a str>,
}
fn thing<'a>() -> Environment<'a> {
let string = String::from("a:b:c");
let substrings = vec![&string[0..1], &string[2..3], &string[4..5]];
Environment {
string: string,
substrings: substrings,
}
}
"The object is created within the environment and is scoped to it"
I'd disagree; string and substrings are created outside of the closure's environment and moved into it. It's that move that's tripping you up.
"once created it could be safely moved to the closure."
In this case that's true, but only because you, the programmer, can guarantee that the address of the string data inside the String will remain constant. You know this for two reasons:
String is internally implemented with a heap allocation, so moving the String doesn't move the string data.
The String will never be mutated, which could cause the string to reallocate, invalidating any references.
The easiest solution for your example is to simply convert the slices to Strings and let the closure own them completely. This may even be a net benefit if that means you can free a large string in favor of a few smaller strings.
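That simple option can be sketched as follows. The return value of the closure (a count of printed pieces) is an addition for illustration only. All allocation happens during staging, and the closure only reads the Strings it owns.

```rust
fn stage_action() -> impl Fn() -> usize {
    let string = String::from("a:b:c");

    // Copy each slice into its own String during the staging phase.
    let substrings: Vec<String> = vec![
        string[0..1].to_owned(),
        string[2..3].to_owned(),
        string[4..5].to_owned(),
    ];
    // `string` is dropped at the end of this function; nothing borrows it.

    move || {
        let mut printed = 0;
        for sub in &substrings {
            println!("{}", sub);
            printed += 1;
        }
        printed
    }
}

fn main() {
    let action = stage_action();
    // ...executed some time later:
    action();
}
```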
Otherwise, you meet the criteria laid out under "There is a special case where the lifetime tracking is overzealous" in Why can't I store a value and a reference to that value in the same struct?, so you can use crates like:
owning_ref
use owning_ref::RcRef; // 0.4.1
use std::rc::Rc;
fn stage_action() -> impl Fn() {
let string = RcRef::new(Rc::new(String::from("a:b:c")));
let substrings = vec![
string.clone().map(|s| &s[0..1]),
string.clone().map(|s| &s[2..3]),
string.clone().map(|s| &s[4..5]),
];
move || {
for sub in &substrings {
println!("{}", &**sub);
}
}
}
fn main() {
let action = stage_action();
action();
}
ouroboros
use ouroboros::self_referencing; // 0.2.3
fn stage_action() -> impl Fn() {
#[self_referencing]
struct Thing {
string: String,
#[borrows(string)]
substrings: Vec<&'this str>,
}
let thing = ThingBuilder {
string: String::from("a:b:c"),
substrings_builder: |s| vec![&s[0..1], &s[2..3], &s[4..5]],
}
.build();
move || {
thing.with_substrings(|substrings| {
for sub in substrings {
println!("{}", sub);
}
})
}
}
fn main() {
let action = stage_action();
action();
}
Note that I'm no expert user of either of these crates, so these examples may not be the best use of them.
