Rust conditional compilation(cfg) with single identifier - rust

I'm trying to understand the following example from the conditional compilation manual at doc.rust-lang.org:
// This function is only included when either foo or bar is defined
#[cfg(any(foo, bar))]
fn needs_foo_or_bar() {
// ...
}
What do those foo and bar identifiers represent?
Is this a shortcut for target_os identifiers or what is it for?

Turns out it has nothing to do with target_os; it is whatever you set in RUSTFLAGS when you build or run:
RUSTFLAGS='--cfg foo' cargo build
Which configuration options are set is determined statically during the compilation of the crate. Certain options are compiler-set based on data about the compilation. Other options are arbitrarily-set, set based on input passed to the compiler outside of the code.
Related question with a bit more advanced example
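As a minimal sketch of how such arbitrarily-set options behave, the example from the question can be paired with a fallback. The fallback function and the returned strings are my own additions for illustration. Recent toolchains may also warn about cfg names they do not recognise.

```rust
// `foo` and `bar` are custom cfg names; the compiler never sets them itself.
// Build with e.g. RUSTFLAGS='--cfg foo' cargo run to select the first version.
#[cfg(any(foo, bar))]
fn needs_foo_or_bar() -> &'static str {
    "foo or bar was passed via --cfg"
}

// Fallback (an addition for this sketch) so the program still compiles
// when neither option is set.
#[cfg(not(any(foo, bar)))]
fn needs_foo_or_bar() -> &'static str {
    "neither foo nor bar is set"
}

fn main() {
    println!("{}", needs_foo_or_bar());
    // cfg! evaluates the same kind of predicate to a compile-time bool.
    println!("foo set: {}", cfg!(foo));
}
```

Exactly one of the two functions survives compilation, so the call in main always resolves.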

From the syntax definition in the conditional compilation manual, any(foo, bar) is a ConfigurationPredicate (specifically the ConfigurationAny variant), so foo, bar is a ConfigurationPredicateList and foo and bar are each a ConfigurationPredicate. That means each one can be any predicate, such as target_os = "linux" to compile conditionally for a target OS. Or you could use custom features like this:
#[cfg(any(feature = "myfeature1", feature = "myfeature2"))]
pub struct Source {
pub email: String,
pub name: String,
}
For more on custom features, see this question
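To illustrate the target OS case mentioned above, here is a small sketch using key = "value" predicates inside any(...). The function name and the returned labels are made up for the example.

```rust
/// Returns a label for the OS this binary was compiled for.
/// (`platform` and its labels are hypothetical names for this sketch.)
fn platform() -> &'static str {
    if cfg!(target_os = "linux") {
        "linux"
    } else if cfg!(any(target_os = "macos", target_os = "windows")) {
        // any(...) accepts a list of predicates, just like any(foo, bar).
        "macos or windows"
    } else {
        "something else"
    }
}

fn main() {
    println!("compiled for: {}", platform());
}
```

Unlike #[cfg(...)], the cfg! macro keeps all branches in the source and only evaluates the predicate, so every branch must still type-check on every platform.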

Related

Rust preprocessor: Plain text replacement like #define for syntax simplification

In C, you can define macros using #define to have something globally replaced by the C preprocessor. How can I achieve the same plain-text replacement in Rust?
My goal is to define macros that simplify Rust syntax and reduce boilerplate, such as replacing !!. with .unwrap(). to simulate the "non-null assertion operator" syntax in Kotlin, Swift, or TypeScript, and replacing (\w+)\?\? with Option<$1> to simplify optional type syntax.
For example, the following struct
struct ReturnPath {
name: Option<String>,
file_type: Option<String>,
mtime: Option<i64>
}
would be much shorter in the processed syntax:
struct ReturnPath {
name: String??,
file_type: String??,
mtime: i64??
}
How can I achieve the same plain-text replacement in Rust?
You can't do this without a DSL macro, essentially.
C macros are token-replacement and happen before the syntax-parsing step -- Rust macros are more akin to Lisp macros, which happen during the parsing step. See: https://softwareengineering.stackexchange.com/a/28387
replacing (\w+)?? with Option<$1> to simplify optional type syntax.
This also won't do what you want it to do in more complicated scenarios, such as if you have Result<T, R>?? or Vec<&'static str>?? or something like that. This is because Rust's grammar is more powerful than regexes -- it is mostly context-free with some context-sensitive parts.
What you're probably looking for is some sort of declarative and syntactical replacement from ($t:ty) \? \? => Option<$t>.
In addition, this regex can also probably match on non-types like a?? if a: Option<Option<T>>.
If you wanted to create a special syntax for this, you'd probably do something like:
macro_rules! shorthand_struct_fields {
/* rows of idents to types and then adjust struct definition but with T?? substituted for Option<T> */
}
struct Foo {
shorthand_struct_fields! {
x: i32??,
}
}
I'm not sure this is such a good idea, but if you do want to create this large macro, you would be best served by writing a declarative macro with sub-macros that use internal @ rules, like so: http://danielkeep.github.io/tlborm/book/pat-internal-rules.html
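For what it's worth, here is a deliberately limited sketch of such a macro. It wraps the whole struct definition and only accepts single-identifier types such as String or i64, so Vec<u8>?? would not match. The macro name shorthand_struct is hypothetical.

```rust
// Rewrites `field: T??` into `field: Option<T>` for single-identifier types.
// This is a sketch, not a general solution: generic or path types would need
// a more involved (token-munching) macro.
macro_rules! shorthand_struct {
    (struct $name:ident { $($field:ident : $ty:ident ? ?),* $(,)? }) => {
        #[derive(Debug, PartialEq)]
        struct $name {
            $($field: Option<$ty>),*
        }
    };
}

shorthand_struct! {
    struct ReturnPath {
        name: String??,
        file_type: String??,
        mtime: i64??,
    }
}

fn main() {
    let path = ReturnPath {
        name: Some("notes.txt".to_string()),
        file_type: None,
        mtime: Some(1_700_000_000),
    };
    println!("{:?}", path);
}
```

The type is matched as an ident rather than a ty fragment because macro_rules does not allow a ty fragment to be followed by a ? token.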

How *exactly* does Rust look up modules?

What is the exact set of rules that Rust uses to look up a module from a file?
Every explanation I have found online about modules says, "this is the purpose of modules, here is an example of one, ..." None give the complete, comprehensive, 100% accurate explanation for how Rust looks up modules. Even the Rust reference doesn't tell you whether both the crate root and the importing file need to declare mod! There is no simple ruleset I can use to tell whether it will work.
I'm looking for something I can follow, like:
Rust looks at the name, parsing :: like subdir::subdir::name
Rust looks to see if there is a file name.rs in the same directory and name/mod.rs
There is not allowed to be both a name.rs and a name/mod.rs.
Then, Rust...???
This is best explained starting from inline modules. Modules are arranged into a hierarchy from the crate root. Every crate, after some desugaring, looks something like this:
// root
pub mod a {
pub mod b {
pub const X: u8 = 1;
}
}
mod foo {
}
Referring to an item in the tree is pretty simple:
:: goes "down" a level
super:: goes "up" a level
crate:: goes to the root level
Examples for referring to X:
a::b::X from the crate root
crate::a::b::X from anywhere in the crate
super::a::b::X from within module foo
b::X from within module a
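The four paths above can be checked in a single file. This sketch assumes the file is the crate root (otherwise crate:: would point somewhere else), and the helper functions are added only to have somewhere to write each path.

```rust
// root
pub mod a {
    pub mod b {
        pub const X: u8 = 1;
    }
    // `b::X` from within module a
    pub fn via_child() -> u8 {
        b::X
    }
}

mod foo {
    // `super::a::b::X` from within module foo
    pub fn via_super() -> u8 {
        super::a::b::X
    }
    // `crate::a::b::X` works from anywhere in the crate
    pub fn via_crate() -> u8 {
        crate::a::b::X
    }
}

fn main() {
    // `a::b::X` from the crate root
    println!("{}", a::b::X);
    println!("{} {} {}", a::via_child(), foo::via_super(), foo::via_crate());
}
```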
mod foo; is really just syntactic sugar for either of the following:
#[path = "foo.rs"]
mod foo;
// or
#[path = "foo/mod.rs"]
mod foo;
Which further desugar to:
mod foo {
include!("foo.rs");
}
// or
mod foo {
include!("foo/mod.rs");
}
If foo.rs (or foo/mod.rs) contains a mod bar; then the whole tree would look like:
mod foo {
mod bar {
// contents of `foo/bar.rs` (or `foo/bar/mod.rs`)
}
// remaining contents of `foo.rs`
}
Please note that the usage of mod.rs, while still supported, is discouraged. Instead, it's recommended to use foo.rs for crate::foo and place any submodules of foo in the foo/ directory.
crate:: just always corresponds to the root being compiled at the time. If your crate is sufficiently complex or doesn't follow convention, then certain crate::... item paths can refer to different things in different files. But confusion is easily avoidable by following conventions.

Apparent unused variable in match statement

I am implementing a simple library system to keep track of my pdfs.
I have a Subject enum and an Entry struct defined as follows:
pub enum Subject {
Math,
Programming,
CompSci,
Language,
Misc,
None
}
pub struct Entry {
pub subject: Subject
}
I am trying to implement a function that will operate on a vector of Entry's and return a Vec<&Entry> whose entries match a given Subject.
I have a simple Library struct that is a wrapper around a Vec<Entry>:
pub struct Library {
pub entries: Vec<Entry>
}
In order to do so, I need to iterate through entries and keep only the elements whose .subject field corresponds to the desired subject. To accomplish this I have created a function that will return a predicate function.
Here is the get_subject function:
impl Library {
pub fn get_subject(&self, subject: Subject) -> Vec<&Entry> {
let pred = subject_pred(subject);
self.entries.iter().filter(pred).collect::<Vec<&Entry>>()
}
}
which calls the function subject_pred to create the correct predicate function:
// Return a PREDICATE that returns true when
// the passed ENTRY matches the desired SUBJECT
fn subject_pred(subject_UNUSED: Subject) -> impl FnMut(&&Entry) -> bool {
|e: &&Entry| if matches!(&e.subject, subject_UNUSED) {
true
} else {
false
}
}
Here's the problem. This syntax compiles just fine but apparently the subject_UNUSED local variable in subject_pred is "unused". I am flabbergasted as my syntax clearly shows intent to match with the passed subject_UNUSED. When I test out this function on a vector of entries, the predicate always returns true (hence why I am receiving the "unused" warning) but I have literally no idea why.
If anyone could explain why the match statement is always matched, that would be greatly appreciated. I tried using a regular match statement but the same warning is popping up, and this is not the behavior that I am trying to code. If I don't include the subject_UNUSED in a traditional match statement, the compiler tells me that I have to cover the Math, Programming, CompSci, Language, Misc and None variants of my enum, which indicates to me that everything up until that point is good.
You cannot match against a variable. What you've done is equivalent to
matches!(&e.subject, some_subject)
That matches any Subject, just like a wildcard (_), except that it also binds the matched value to a new variable named some_subject (which could then be used in a guard, as in matches!(&e.subject, subject_UNUSED if subject_UNUSED == ...)). Neither this new binding nor the function parameter it shadows is ever used.
What you need to do is to #[derive(PartialEq)] then use ==:
if e.subject == subject_UNUSED { ... }
By the way, your code also has other problems: you don't move into the closure and you're taking owned entries but produce borrowed.
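Putting those points together, a corrected version might look like the sketch below. The parameter is renamed to subject, and the extra derives (Debug, Clone, Copy, Eq) are my own assumptions beyond the PartialEq that == needs.

```rust
// PartialEq is what makes `==` available on Subject.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Subject {
    Math,
    Programming,
    CompSci,
    Language,
    Misc,
    None,
}

pub struct Entry {
    pub subject: Subject,
}

pub struct Library {
    pub entries: Vec<Entry>,
}

// Returns a predicate that is true when the entry has the desired subject.
// `move` transfers ownership of `subject` into the returned closure.
fn subject_pred(subject: Subject) -> impl Fn(&&Entry) -> bool {
    move |e: &&Entry| e.subject == subject
}

impl Library {
    pub fn get_subject(&self, subject: Subject) -> Vec<&Entry> {
        self.entries.iter().filter(subject_pred(subject)).collect()
    }
}

fn main() {
    let library = Library {
        entries: vec![
            Entry { subject: Subject::Math },
            Entry { subject: Subject::Programming },
            Entry { subject: Subject::Math },
        ],
    };
    println!("{} math entries", library.get_subject(Subject::Math).len());
}
```

Since the predicate is now a plain comparison, the if/else returning true/false is no longer needed.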

How does Rust respect the Copy trait?

If you make a struct derive the Copy trait then Rust is going to make y as a copy of x in the code below, as opposed to moving from x to y otherwise:
#[derive(Debug, Copy, Clone)]
struct Foo;
let x = Foo;
let y = x;
If I were in C++ I'd say that Copy somehow makes Foo implement the = operator in a way that it copies the entire object on the right side.
In Rust, is it simply implemented as a rule in the compiler? When the compiler finds let y = x, does it simply check whether the Copy trait is derived and decide whether to copy or move?
I'm interested in Rust internals so I can understand the language better. This information can't be found in tutorials.
Yes, this is directly implemented in the compiler.
It affects any situation that would otherwise move the item, so it also affects passing parameters to functions or matching in a match expression – basically any situation that involves pattern matching. In that way, it's not really comparable to implementing the = operator in C++.
The definition of the Copy trait is marked as a "lang" item in the source code of the standard library. The compiler knows that the item marked with #[lang = "copy"] is the trait that decides whether a type is moved or copied. The compiler also knows some rules about types that are implicitly Copy, like closures or tuples that only contain items that are Copy.
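The implicit cases are easy to check with a small helper that only compiles for Copy types. The helper name is_copy is made up for this sketch.

```rust
// Compiles only when T: Copy, so a successful build is the actual check.
fn is_copy<T: Copy>(_value: &T) -> bool {
    true
}

fn main() {
    // A tuple whose elements are all Copy is itself Copy.
    let pair = (1u8, 2.5f64);
    let pair_again = pair;
    println!("{:?} {:?} {}", pair, pair_again, is_copy(&pair));

    // A closure that captures only Copy values is Copy as well.
    let offset = 10;
    let add_offset = move |x: i32| x + offset;
    let same_closure = add_offset;
    println!("{} {}", add_offset(1), same_closure(2));
    println!("{}", is_copy(&add_offset));
}
```

If the closure captured a String by value instead, the second use of add_offset would be rejected as a use after move.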
In Rust, is it simply implemented as a rule in the compiler? When the compiler finds let y = x, does it simply check whether the Copy trait is derived and decide whether to copy or move?
At runtime, there is no semantic difference (though the applicable optimisations might vary): both a move and a copy are just a memcpy, and in either case the copy can be optimised away.
At compile time, the compiler is indeed aware of the Copy/!Copy distinction: if x is a !Copy type, the assignment "consumes" the variable, meaning you can't use it afterwards.
If the item is Copy then it doesn't and you can.
That's about it.
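A short sketch of that compile-time difference, using two made-up types (Meters is Copy, Label is not):

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

#[derive(Debug, Clone, PartialEq)]
struct Label(String);

// Takes its argument by value, which is a move or a copy depending on T.
fn pass_through<T>(value: T) -> T {
    value
}

fn main() {
    let m = Meters(1.5);
    let m2 = m; // copy: `m` stays usable
    let m3 = pass_through(m); // passing by value copies too
    println!("{:?} {:?} {:?}", m, m2, m3);

    let l = Label("shelf".to_string());
    let l2 = l; // move: `l` is consumed here
    // println!("{:?}", l); // would not compile: `l` was moved
    let l3 = l2.clone(); // an explicit clone keeps `l2` usable
    println!("{:?} {:?}", l2, l3);
}
```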
If you want to dig into how this code is compiled, you could take a look at the MIR representation in the playground. In this slightly simplified version:
#[derive(Copy, Clone)]
struct Foo;
fn main() {
let x = Foo;
let y = x;
}
the slightly trimmed MIR output is:
bb0: {
StorageLive(_1);
_1 = const Scalar(<ZST>): Foo;
StorageLive(_2);
_2 = const Scalar(<ZST>): Foo;
StorageDead(_2);
StorageDead(_1);
return;
}
So in this specific case, the compiler has determined that Foo is a Zero Sized Type (ZST) for x and y (here, _1 and _2), so both are assigned a constant empty value, so there isn't any copying as-such.
To see it in the playground, select "MIR" from the drop-down menu (the triple-dots button just to the right of the "Run" button). For more in-depth information about MIR, take a look at the rustc dev guide.

How many lines are covered by the Rust conditional compilation attribute?

I'm trying to use a conditional compilation statement. Beyond defining a function that should only exist in a debug build, I want to define a set of variables/constants/types that only exist in the debug build.
#[cfg(debug)]
pub type A = B;
pub type B = W;
#[cfg(other_option)]
pub type A = Z;
pub type B = I;
let test = 23i32;
How many lines are actually "covered" by the conditional compile attribute in this case? Is it only one (what I would expect in this context)? Are there ways to ensure that a whole block of code (including variables, types and two functions) is covered by the condition?
You can use a module to group together everything that should exist for debug/release only, like this:
#[cfg(debug)]
mod example {
pub type A = i32;
pub type B = i64;
}
#[cfg(not(debug))]
mod example {
pub type A = u32;
pub type B = u64;
}
fn main() {
let x: example::A = example::A::MAX;
println!("{}", x);
}
Playground link (note that this will always print the not(debug) value because the playground doesn't define the debug feature, even in debug mode).
If debug is defined, this will print 2147483647 (the maximum value of an i32), otherwise it will print 4294967295 (the maximum value of a u32). Keep in mind that both modules must have definitions for each item, otherwise you'll hit a compile-time error.
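Note that debug is not an option the compiler sets on its own. If the goal is to tell debug builds from release builds without passing extra flags, one option is the built-in debug_assertions, which is enabled by default in unoptimized builds. A sketch of the same module trick with it (the PROFILE constant is an addition for illustration):

```rust
// Everything inside the module is covered by the single attribute.
#[cfg(debug_assertions)]
mod example {
    pub type A = i32;
    pub const PROFILE: &str = "debug";
}

#[cfg(not(debug_assertions))]
mod example {
    pub type A = u32;
    pub const PROFILE: &str = "release";
}

fn main() {
    let x: example::A = example::A::MAX;
    println!("{} build, max = {}", example::PROFILE, x);
}
```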
If you've not read about Attributes, it might be a good idea to do so; make sure you know the difference between inner attributes (#![attribute]) and outer attributes (#[attribute]).
An #[attribute] only applies to the next item. Please see the Rust book.
Edit: I don't think it is currently possible to spread an attribute over an arbitrary number of declarations.
Additional in-depth information on attributes and their application can be found in the Rust reference.
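To make the "next item only" rule concrete, here is a small sketch. It uses cfg(any()), which is always false, so the effect is visible without passing any flags. The type and module names are hypothetical.

```rust
#[cfg(any())] // any() with no arguments is always false
pub type A = u8; // covered by the attribute: removed from the build
pub type B = u16; // NOT covered: always present

// To cover several items with one attribute, wrap them in a module.
#[cfg(any())]
mod disabled_group {
    pub type C = u32;
    pub fn helper() {}
}

fn main() {
    let b: B = 7;
    println!("{}", b);
    // let a: A = 1; // would not compile: `A` was removed
    // disabled_group::helper(); // likewise removed
}
```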
