cfg attribute with arbitrary constant expression - rust

I have the following const:
const IS_WSL: bool = is_wsl!();
and I'd like to be able to use this with the cfg attribute to perform conditional compilation. Something like:
#[cfg(const = "IS_WSL")] // what goes here?
const DOWNLOADS: &'static str = "/mnt/c/Users/foo/Downloads";
#[cfg(not(const = "IS_WSL"))]
const DOWNLOADS: &'static str = "/home/foo/Downloads";
Obviously this syntax doesn't work, but is there any way to achieve what I'm describing?
I'm aware of custom rustc flags, but would like to avoid doing that, since there's a fair amount of logic that I'd rather not try to write in bash.

The answer is no. You have to use something like a build script to achieve that.
It cannot work because cfg-expansion occurs at an earlier pass in the compiler than constant evaluation.
cfg expansion works at the same time as macro expansion. Both can affect name resolution (macros can create new names, which other macros, or even the same macro, can later refer to), which forces the compiler to use a fixed-point algorithm: resolve names, expand macros, resolve names, expand macros... until no more names can be resolved, i.e. a fixed point is reached. const evaluation, on the other hand, takes place after type checking, sometimes (with generic_const_exprs) even during codegen. If it could affect macro expansion, we would have a giant fixed-point loop: resolve names, expand macros, ... until a fixed point is reached, then lower to HIR, type-check, evaluate constants (or even lower to MIR, monomorphize and evaluate constants), and then back to name resolution. Besides slowing the compiler down a lot, it would also make it significantly more complex, which is not something the rustc team wants.
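For completeness, here is a minimal sketch of the build-script route, assuming the WSL check can be expressed with ordinary std code at build time. Grepping /proc/version for "microsoft" is only an illustrative heuristic standing in for whatever is_wsl!() does, and the cfg name is_wsl is made up for the example:
// build.rs
fn main() {
    // Illustrative heuristic: WSL kernels usually report "microsoft" in /proc/version.
    let is_wsl = std::fs::read_to_string("/proc/version")
        .map(|v| v.to_lowercase().contains("microsoft"))
        .unwrap_or(false);
    if is_wsl {
        // Enables #[cfg(is_wsl)] / #[cfg(not(is_wsl))] in the crate being built.
        println!("cargo:rustc-cfg=is_wsl");
    }
    println!("cargo:rerun-if-changed=build.rs");
}
With that in place, the two constants can be selected with #[cfg(is_wsl)] and #[cfg(not(is_wsl))]; newer toolchains may also want a cargo::rustc-check-cfg=cfg(is_wsl) line to silence the unexpected-cfg lint.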

In your specific case, since both cfg variants declare a const with the same name and type, you can just match on IS_WSL:
const IS_WSL: bool = is_wsl!();
const DOWNLOADS: &'static str = match IS_WSL {
    true => "/mnt/c/Users/foo/Downloads",
    false => "/home/foo/Downloads",
};
This doesn't have the same power as cfg does, but it is still useful if you just need to select two values of the same type.

Related

Anchor `declare_id!` with Environment Variable - Environment Variables in Macros

In anchor the "top" of any program features a declare_id!() statement. The input to that statement is a Pubkey, typically in the form of a hard-coded string. 12-factor methodology typically dictates that hard-coded configuration values like this should be avoided. However, trying not to hard-code the value has me pulling out my hair.
declare_id!(std::env::var("VARIABLE_NAME"));
Does not work because the env::var call executes at runtime while the macro executes at compile time.
declare_id!(env!("VARIABLE_NAME"));
Does not work because env! returns a &str.
declare_id!(Pubkey::from_str(env!("VARIABLE_NAME")));
Does not work because Pubkey::from_str can fail and as such returns a Result.
declare_id!(Pubkey::from_str(env!("VARIABLE_NAME")).unwrap());
Does not work because declare_id! requires a constant and constants cannot be made from unwrap (it's probably more nuanced than that, but I'm new to Rust), and ? fails for the same reason.
How would I go about defining an environment variable within a macro?
Given the lack of resources on the topic, I'm presuming an environment variable in this case is not a best practice. Why should one not use an environment variable in this case?
How can one accomplish the injection of program id into an anchor application if environment variables are not the way to do so?
Bonus points:
env::var() returns a Result<String>
env::var_os() returns an Option<OsString>
env!() returns a &'static str
env_os!() does not exist in the standard library
How do you handle an OsStr environment variable at build time? Why would you not need to be able to?
Working with env vars seems troublesome, since most of the ways of creating a Pubkey aren't const fn. To me, this seems like an issue with the anchor-lang crate; usually env vars aren't this troublesome.
That said, you could write the key to a config file somewhere, and read that in using include_bytes, and pass that to Pubkey::new_from_array, which is const:
// this works because concat! expands to a string literal
const BYTES: &[u8; 32] = include_bytes!(concat!(env!("CARGO_MANIFEST_DIR"), "/key"));
anchor_lang::declare_id!(Pubkey::new_from_array(*BYTES));
If you don't want to store the file in the repo, you can point include_bytes!() at any location, even one read from an env var. For example, you could have:
const BYTES: &[u8; 32] = include_bytes!(env!("ANCHOR_KEY_LOCATION"));
anchor_lang::declare_id!(Pubkey::new_from_array(*BYTES));
which will read the key bytes from a file at $ANCHOR_KEY_LOCATION.
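If you would rather keep the id in an environment variable, as the question asks, a possible variation is to decode it in a build script and write the raw bytes into OUT_DIR. This is only a sketch: the PROGRAM_ID variable name is made up for the example, and it assumes a bs58 build-dependency for the base58 decoding:
// build.rs
use std::{env, fs, path::Path};

fn main() {
    println!("cargo:rerun-if-env-changed=PROGRAM_ID");
    // Decode the base58-encoded pubkey from the environment into its 32 raw bytes.
    let id = env::var("PROGRAM_ID").expect("PROGRAM_ID must be set");
    let bytes = bs58::decode(&id).into_vec().expect("PROGRAM_ID is not valid base58");
    assert_eq!(bytes.len(), 32, "a pubkey is 32 bytes");
    let out = Path::new(&env::var("OUT_DIR").unwrap()).join("key");
    fs::write(out, bytes).unwrap();
}
and then include it with const BYTES: &[u8; 32] = include_bytes!(concat!(env!("OUT_DIR"), "/key")); exactly as above.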

Dynamically generating arguments with Clap

I'm trying to figure out how to dynamically generate arguments from input arguments with Clap.
What I'm trying to emulate with Clap is the following Python code:
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-i", type=str, nargs="*")
(input_args, additional_args) = parser.parse_known_args()
for arg in input_args.i:
    parser.add_argument(f'--{arg}-bar', required=True, type=str)
additional_config = parser.parse_args(additional_args)
So that you can do the following in your command:
./foo.py -i foo bar baz --foo-bar foo --bar-bar bar --baz-bar bar
and have the additional arguments be dynamically generated from the first arguments. Not sure if it's possible to do in Clap but I assumed it was maybe possible due to the Readme stating you could use the builder pattern to dynamically generate arguments[1].
So here is my naive attempt of trying to do this.
use clap::{Arg, App};
fn main() {
    let mut app = App::new("foo")
        .arg(Arg::new("input")
            .short('i')
            .value_name("INPUT")
            .multiple(true)
            .required(true));
    let matches = app.get_matches_mut();
    let input: Vec<_> = matches.values_of("input").unwrap().collect();
    for i in input {
        app.arg(Arg::new(&*format!("{}-bar", i)).required(true));
    }
}
Which obviously does not work: the compiler screams at you about both the format! lifetime and the call to app.arg. I'm mostly interested in solving how I could generate new arguments for the app which could then be matched against again. I'm quite new to Rust, so it's quite possible this is not possible with Clap.
[1] https://github.com/clap-rs/clap
I assumed it was maybe possible due to the Readme stating you could use the builder pattern to dynamically generate arguments[1].
Dynamically generating arguments means that you can call .arg with runtime values and it will work fine (i.e. the entire CLI doesn't need to be fully defined at compile time; this distinction doesn't exist in Python, where everything is done at runtime).
What you're doing here is significantly more complicated (and specialised, and odd), as you're passing through unknown parameters and then re-parsing them.
Now first of all, you literally can't reuse App in clap: most of its methods (very much including get_matches) take self and therefore "consume" the App and return something else, either the original App or a result. Although you can clone the original App before you get_matches it I guess.
But I don't think that's useful here: though I have not tried it, it should be possible to do what you want using TrailingVarArg: this would collect all trailing arguments into a single positional arg slice (you will probably need AllowLeadingHyphen as well), then you can create a second App with dynamically generated parameters in order to parse that sub-set of arguments (get_matches_from will parse from an iterator rather than the env args; this is useful for testing... or for this exact sort of situation).
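Here is an untested sketch of that two-pass idea against the clap 3 style builder API the question uses. To sidestep the AllowLeadingHyphen subtleties it asks for a literal -- before the dynamic flags (a positional arg with last(true)) instead of TrailingVarArg, and it Box::leaks the generated argument names because this version of clap wants &str with the 'help lifetime, which is the lifetime error from the question:
use clap::{App, Arg};

fn main() {
    // First pass: parse -i, and collect everything after "--" verbatim.
    let first = App::new("foo")
        .arg(Arg::new("input").short('i').takes_value(true).multiple(true).required(true))
        .arg(Arg::new("rest").multiple(true).last(true))
        .get_matches();

    let inputs: Vec<String> = first.values_of("input").unwrap().map(str::to_owned).collect();
    let rest: Vec<String> = first
        .values_of("rest")
        .map(|v| v.map(str::to_owned).collect())
        .unwrap_or_default();

    // Second pass: a fresh App whose arguments are derived from the first pass.
    let mut second = App::new("foo");
    for i in &inputs {
        // Leaking yields the &'static str clap 3's builder wants; it happens once at startup.
        let name: &'static str = Box::leak(format!("{}-bar", i).into_boxed_str());
        second = second.arg(Arg::new(name).long(name).takes_value(true).required(true));
    }
    // get_matches_from treats the first item as the binary name.
    let extra = second.get_matches_from(std::iter::once("foo".to_string()).chain(rest));

    for i in &inputs {
        let name = format!("{}-bar", i);
        println!("{} = {:?}", name, extra.value_of(name.as_str()));
    }
}
Invoked as ./foo -i foo bar baz -- --foo-bar a --bar-bar b --baz-bar c.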

How can I store a format string template outside of my source code?

When translating, messages can be in different languages and have format parameters. I want to be able to do this where the template can be stored in a file:
static PATTERN: &'static str = r"Hello {inner};";
/// in some implementation
fn any_method(&self) -> String {
    format!(PATTERN, inner = "world")
}
That's not possible. Format strings must be actual literal strings.
The next best approach would be some kind of dynamic string format library. Or, failing that, you could always use str::replace if your needs aren't too complex.
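For the str::replace route, a minimal sketch (the template file name and placeholder syntax are just illustrative; the file is loaded at runtime, so it does not need to exist at compile time):
use std::fs;

fn render(template: &str, inner: &str) -> String {
    // Plain textual substitution: fine for simple cases, no escaping or validation.
    template.replace("{inner}", inner)
}

fn main() {
    let template = fs::read_to_string("hello-world.tmpl").expect("template file missing");
    println!("{}", render(&template, "world"));
}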
This is definitely possible and trivial to do using the include_str macro that has been available in the standard library since version 1.0.0. The following example was tested with rustc 1.58.1:
Contents of src/main.rs:
println!(include_str!("hello-world.tmpl"), inner = "world");
Contents of src/hello-world.tmpl
Hello {inner}
This works because include_str injects the contents of the template file as a string literal before println, format, and friends have a chance to evaluate their arguments. This approach only works when the format template you want to include is available during macro expansion - like it is in your example. If it's not, then you should consider other options like the ones suggested by @DK.
As an added bonus: You can also define format strings in source code locations other than the site where they are used by defining them as macros.
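Here is a hedged sketch of that bonus point: a macro_rules macro that expands to a string literal can live away from the use site and be used as the format string, because format! and friends expand a macro call in the format-string position down to the literal before checking it:
// This could be defined far away from where it is used.
macro_rules! pattern {
    () => { "Hello {inner}" };
}

fn main() {
    // The macro expands to a string literal, so format! accepts it.
    let s = format!(pattern!(), inner = "world");
    println!("{}", s);
}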

difference between variable definition in a Haskell source file and in GHCi?

In a Haskell source file, I can write
a = 1
and I had the impression that I have to write the same in GHCi as
let a = 1
, for a = 1 in GHCi gives a parse error on =.
Now, if I write
a = 1
a = 2
in a source file, I will get an error about Multiple declaration of a, but it is OK to write in GHCi:
let a = 1
let a = 2
Can someone help clarify the difference between the two styles?
Successive let "statements" in the interactive interpreter are really the equivalent of nested let expressions. They behave as if there is an implied in following the assignment, and the rest of the interpreter session comprises the body of the let. That is
>>> let a = 1
>>> let a = 1
>>> print a
is the same as
let a = 1 in
let a = 1 in
print a
There is a key difference in Haskell between having two definitions of the same name in a single scope, and having two definitions of the same name in nested scopes. GHCi vs modules in a file isn't really related to the underlying concept here, but those situations do lead you to encounter problems if you're not familiar with it.
A let-expression (and a let-statement in a do block) creates a set of bindings with the same scope, not just a single binding. For example, as an expression:
let a = True
a = False
in a
Or with braces and semicolons (more convenient to paste into GHCi without turning on multi-line mode):
let { a = True; a = False} in a
This will fail, whether in a module or in GHCi. There cannot be a single variable a that is both True and False, and there can't be two separate variables named a in the same scope (or it would be impossible to know which one was being referred to by the source text a).
The variables in a single binding set are all defined "at once"; the order they're written in is not relevant at all. You can see this because it's possible to define mutually recursive bindings that all refer to each other, and couldn't possibly be defined one at a time in any order:
λ let a = True : b
| b = False : a
| in take 10 a
[True,False,True,False,True,False,True,False,True,False]
it :: [Bool]
Here I've defined an infinite list of alternating True and False, and used it to come up with a finite result.
A Haskell module is a single scope, containing all the definitions in the file. Exactly as in a let-expression with multiple bindings, all the definitions "happen at once"[1]; they're only in a particular order because writing them down in a file inevitably introduces an order. So in a module this:
a = True
a = False
gives you an error, as you've seen.
In a do-block you have let-statements rather than let-expressions.[2] These don't have an in part since they just scope over the entire rest of the do-block.[3] GHCi commands are very like entering statements in an IO do-block, so you have the same option there, and that's what you're using in your example.
However your example has two let-bindings, not one. So there are two separate variables named a defined in two separate scopes.
Haskell doesn't care (almost ever) about the written order of different definitions, but it does care about the "nesting order" of nested scopes; the rule is that when you refer to a variable a, you get the innermost definition of a whose scope contains the reference.[4]
As an aside, hiding an outer-scope name by reusing a name in an inner scope is known as shadowing (we say the inner definition shadows the outer one). It's a useful general programming term to know, since the concept comes up in many languages.
So it's not that the rules about when you can define a name twice are different in GHCi vs a module, it's just that the different context makes different things easier.
If you want to put a bunch of definitions in a module, the easy thing to do is make them all top-level definitions, which all have the same scope (the whole module) and so you get an error if you use the same name twice. You have to work a bit more to nest the definitions.
In GHCi you're entering commands one at a time, and it's more work to use multi-line commands or the braces-and-semicolon style, so the easy thing when you want to enter several definitions is to use several let statements, and so you end up shadowing earlier definitions if you reuse names.[5] You have to try more deliberately to actually enter multiple names in the same scope.
[1] Or more accurately, the bindings "just are" without any notion of "the time at which they happen" at all.
[2] Or rather: you have let-statements as well as let-expressions, since statements are mostly made up of expressions and a let-expression is always valid as an expression.
[3] You can see this as a general rule that later statements in a do-block are conceptually nested inside all earlier statements, since that's what they mean when you translate them to monadic operations; indeed let-statements are actually translated to let-expressions with the rest of the do-block inside the in part.
[4] It's not ambiguous like two variables with the same name in the same scope would be, though it is impossible to refer to any further-out definitions.
[5] And note that anything you've previously defined referring to the name before the shadowing will still behave exactly as it did before, referring to the previous name. This includes functions that return the value of the variable. It's easiest to understand shadowing as introducing a different variable that happens to have the same name as an earlier one, rather than trying to understand it as actually changing what the earlier variable name refers to.

Unable to chain-access tuple types

Given:
struct NameType([u8;64]);
name: (NameType, NameType);
I can do:
let def = &name.0 OR &name.1
but I cannot do:
let def = &name.0.0 OR &name.1.0
to access the internals. I have to do it twice:
let abc = &name.0;
let def = &abc.0;
Why am I unable to chain it to access inner sub-tuples, tuple structs, etc.?
rustc 1.0.0-nightly (ecf8c64e1 2015-03-21) (built 2015-03-22)
As mentioned in the comments, foo.0.0 will be lexed so that 0.0 is a single floating-point literal rather than two separate field accesses. This was originally mentioned in the RFC, specifically this:
I'd rather not change the lexer to permit a.0.1. I'd rather just have that be an error and have people write out the names. We could always add it later.
You can certainly file a bug, but as a workaround, use parentheses:
(foo.0).0
In my opinion, you shouldn't be nesting tuples that deep anyway. I'd highly recommend giving names to fields before you slowly go insane deciding if you wanted foo.0.1.2 or foo.1.2.0.
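For illustration, a small self-contained sketch of the parenthesised workaround applied to the question's type:
struct NameType([u8; 64]);

fn main() {
    let name: (NameType, NameType) = (NameType([1u8; 64]), NameType([2u8; 64]));
    // Parenthesising makes the compiler see two field accesses instead of the float literal 0.0.
    let def: &[u8; 64] = &(name.0).0;
    let ghi: &[u8; 64] = &(name.1).0;
    println!("{} {}", def[0], ghi[0]);
}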
In addition to the above answers, I have also found that inserting a space works wonders :) So foo.0. 0 or foo.0 . 0 etc. all compile. I don't know how much it means, but there is a way to chain it if somebody wants to, without resorting to parentheses.
