Can I create my own conditional compilation attributes? - rust

There are several ways of doing something in my crate, some result in fast execution, some in low binary size, some have other advantages, so I provide the user interfaces to all of them. Unused functions will be optimized away by the compiler. Internal functions in my crate have to use these interfaces as well, and I would like them to respect the user choice at compile time.
There are conditional compilation attributes like target_os, which store a value like linux or windows. How can I create such an attribute, for example prefer_method, so I and the user can use it somewhat like in the following code snippets?
My crate:
#[cfg(not(any(
not(prefer_method),
prefer_method = "fast",
prefer_method = "small"
)))]
compile_error("invalid `prefer_method` value");
pub fn bla() {
#[cfg(prefer_method = "fast")]
foo_fast();
#[cfg(prefer_method = "small")]
foo_small();
#[cfg(not(prefer_method))]
foo_default();
}
pub fn foo_fast() {
// Fast execution.
}
pub fn foo_small() {
// Small binary file.
}
pub fn foo_default() {
// Medium size, medium fast.
}
The user crate:
#[prefer_method = "small"]
extern crate my_crate;
fn f() {
// Uses the `foo_small` function, the other `foo_*` functions will not end up in the binary.
my_crate::bla();
// But the user can also call any function, which of course will also end up in the binary.
my_crate::foo_default();
}
I know there are --cfg attributes, but AFAIK these only represent boolean flags, not enumeration values, which allow setting multiple flags when only one enumeration value is valid.

Firstly, the --cfg flag supports key-value pairs using the syntax --cfg 'prefer_method="fast"'. This will allow you to write code like:
#[cfg(prefer_method = "fast")]
fn foo_fast() { }
You can also set these cfg options from a build script. For example:
// build.rs
fn main() {
println!("cargo:rustc-cfg=prefer_method=\"method_a\"");
}
// src/main.rs
#[cfg(prefer_method = "method_a")]
fn main() {
println!("It's A");
}
#[cfg(prefer_method = "method_b")]
fn main() {
println!("It's B");
}
#[cfg(not(any(prefer_method = "method_a", prefer_method = "method_b")))]
fn main() {
println!("No preferred method");
}
The above code will result in an executable that prints "It's A".
There's no syntax like the one you suggest to specify cfg settings. The best thing to expose these options to your crates' users is through Cargo features.
For example:
# Library Cargo.toml
# ...
[features]
method_a = []
method_b = []
// build.rs
fn main() {
// prefer method A if both method A and B are selected
if cfg!(feature = "method_a") {
println!("cargo:rustc-cfg=prefer_method=\"method_a\"");
} else if cfg!(feature = "method_b") {
println!("cargo:rustc-cfg=prefer_method=\"method_b\"");
}
}
# User Cargo.toml
# ...
[dependencies.my_crate]
version = "..."
features = ["method_a"]
However, in this case, I'd recommend just using the Cargo features directly in your code (i.e. #[cfg(feature = "fast")]) rather than adding the build script since there's a one-to-one correspondence between the cargo feature and the rustc-cfg being added.

Related

How to pretty print Syn AST?

I'm trying to use syn to create an AST from a Rust file and then using quote to write it to another. However, when I write it, it puts extra spaces between everything.
Note that the example below is just to demonstrate the minimum reproducible problem I'm having. I realize that if I just wanted to copy the code over I could copy the file but it doesn't fit my case and I need to use an AST.
pub fn build_file() {
let current_dir = std::env::current_dir().expect("Unable to get current directory");
let rust_file = std::fs::read_to_string(current_dir.join("src").join("lib.rs")).expect("Unable to read rust file");
let ast = syn::parse_file(&rust_file).expect("Unable to create AST from rust file");
match std::fs::write("src/utils.rs", quote::quote!(#ast).to_string());
}
The file that it creates an AST of is this:
#[macro_use]
extern crate foo;
mod test;
fn init(handle: foo::InitHandle) {
handle.add_class::<Test::test>();
}
What it outputs is this:
# [macro_use] extern crate foo ; mod test ; fn init (handle : foo :: InitHandle) { handle . add_class :: < Test :: test > () ; }
I've even tried running it through rustfmt after writing it to the file like so:
utils::write_file("src/utils.rs", quote::quote!(#ast).to_string());
match std::process::Command::new("cargo").arg("fmt").output() {
Ok(_v) => (),
Err(e) => std::process::exit(1),
}
But it doesn't seem to make any difference.
The quote crate is not really concerned with pretty printing the generated code. You can run it through rustfmt, you just have to execute rustfmt src/utils.rs or cargo fmt -- src/utils.rs.
use std::fs;
use std::io;
use std::path::Path;
use std::process::Command;
fn write_and_fmt<P: AsRef<Path>, S: ToString>(path: P, code: S) -> io::Result<()> {
fs::write(&path, code.to_string())?;
Command::new("rustfmt")
.arg(path.as_ref())
.spawn()?
.wait()?;
Ok(())
}
Now you can just execute:
write_and_fmt("src/utils.rs", quote::quote!(#ast)).expect("unable to save or format");
See also "Any interest in a pretty-printing crate for Syn?" on the Rust forum.
As Martin mentioned in his answer, prettyplease can be used to format code fragments, which can be quite useful when testing proc macro where the standard to_string() on proc_macro2::TokenStream is rather hard to read.
Here a code sample to pretty print a proc_macro2::TokenStream parsable as a syn::Item:
fn pretty_print_item(item: proc_macro2::TokenStream) -> String {
let item = syn::parse2(item).unwrap();
let file = syn::File {
attrs: vec![],
items: vec![item],
shebang: None,
};
prettyplease::unparse(&file)
}
I used this in my tests to help me understand where is the wrong generated code:
assert_eq!(
expected.to_string(),
generate_event().to_string(),
"\n\nActual:\n {}",
pretty_print_item(generate_event())
);
Please see the new prettyplease crate. Advantages:
It can be used directly as a library.
It can handle code fragments while rustfmt only handles full files.
It is fast because it uses a simpler algorithm.
Similar to other answers, I also use prettyplease.
I use this little trick to pretty-print a proc_macro2::TokenStream (e.g. what you get from calling quote::quote!):
fn pretty_print(ts: &proc_macro2::TokenStream) -> String {
let file = syn::parse_file(&ts.to_string()).unwrap();
prettyplease::unparse(&file)
}
Basically, I convert the token stream to an unformatted String, then parse that String into a syn::File, and then pass that to prettyplease package.
Usage:
#[test]
fn it_works() {
let tokens = quote::quote! {
struct Foo {
bar: String,
baz: u64,
}
};
let formatted = pretty_print(&tokens);
let expected = "struct Foo {\n bar: String,\n baz: u64,\n}\n";
assert_eq!(formatted, expected);
}

Rust/rocket pass variable to endpoints

Not to my preference but I'm forced to write some Rust today so I'm trying to create a Rocket instance with only one endpoint but, on that endpoint I need to access a variable that is being created during main. The variable takes a long time to be instantiated so that's why I do it there.
My problem is that I can;t find a way to pass it safely. Whatever I do, the compiler complaints about thread safety even though the library appears to be thread safe: https://github.com/brave/adblock-rust/pull/130 (commited code is found on my local instance)
This is the error tat I get:
|
18 | / lazy_static! {
19 | | static ref rules_engine: Mutex<Vec<Engine>> = Mutex::new(vec![]);
20 | | }
| |_^ `std::rc::Rc<std::cell::RefCell<lifeguard::CappedCollection<std::vec::Vec<u64>>>>` cannot be sent between threads safely
|
...and this is my code:
#![feature(proc_macro_hygiene, decl_macro)]
#[macro_use]
extern crate rocket;
use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;
use lazy_static::lazy_static;
use std::sync::Mutex;
use adblock::engine::Engine;
use adblock::lists::FilterFormat;
use rocket::request::{Form, FormError, FormDataError};
lazy_static! {
static ref rules_engine: Mutex<Vec<Engine>> = Mutex::new(vec![]);
}
fn main() {
if !Path::new("./rules.txt").exists() {
println!("rules file does not exist")
} else {
println!("loading rules");
let mut rules = vec![];
if let Ok(lines) = read_lines("./rules.txt") {
for line in lines {
if let Ok(ip) = line {
rules.insert(0, ip)
}
}
let eng = Engine::from_rules(&rules, FilterFormat::Standard);
rules_engine.lock().unwrap().push(eng);
rocket().launch();
}
}
}
#[derive(Debug, FromForm)]
struct FormInput<> {
#[form(field = "textarea")]
text_area: String
}
#[post("/", data = "<sink>")]
fn sink(sink: Result<Form<FormInput>, FormError>) -> String {
match sink {
Ok(form) => {
format!("{:?}", &*form)
}
Err(FormDataError::Io(_)) => format!("Form input was invalid UTF-8."),
Err(FormDataError::Malformed(f)) | Err(FormDataError::Parse(_, f)) => {
format!("Invalid form input: {}", f)
}
}
}
fn rocket() -> rocket::Rocket {
rocket::ignite().mount("/", routes![sink])
}
fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where P: AsRef<Path>, {
let file = File::open(filename)?;
Ok(io::BufReader::new(file).lines())
}
Any way of having the eng available inside the sink endpoint method?
Rc is not thread safe, even behind a mutex. It looks like Rc is used in eng.blocker.pool.pool which is a lifeguard::Pool. So no, the Engine is not thread safe (at least by default).
Fortunately, it appears that the adblock crate has a feature called "object-pooling", which enables that specific functionality. Removing that feature will (hopefully) make it thread safe.
Rocket makes it really easy to share resources between routes (and also between main or any other thread you might have spawned from main). They call their mechanism state. Check out its documentation here.
To give a short example of how it works:
You create your type that you want to share in your application and manage an instance of that type in the instance of rocket that you use for your application. In the guide they give this example:
use std::sync::atomic::AtomicUsize;
struct HitCount {
count: AtomicUsize
}
rocket::build().manage(HitCount { count: AtomicUsize::new(0) });
In a route then you access the resource like this (again from the guide):
use rocket::State;
#[get("/count")]
fn count(hit_count: &State<HitCount>) -> String {
let current_count = hit_count.count.load(Ordering::Relaxed);
format!("Number of visits: {}", current_count)
}
While I learnt rocket I needed to share a struct that contained a String, which is not thread safe per se. That means you need to wrap it into a Mutex before you can manage it with rocket.
Also, as far as I understand, only one resource of any specific type can be shared with manage. But you can just create differently named wrapper types in that case and work around that limitation.

How do I conditionally compile for WebAssembly in Rust?

How can I make a config flag where I conditionally choose the wasm32-unknown-unkown target?
I printed the current environment using the following build.rs:
use std::env;
fn main() {
for (key, value) in env::vars() {
if key.starts_with("CARGO_CFG_") {
println!("{}: {:?}", key, value);
}
}
panic!("stop and dump stdout");
}
Which prints:
CARGO_CFG_DEBUG_ASSERTIONS: ""
CARGO_CFG_TARGET_ARCH: "wasm32"
CARGO_CFG_TARGET_ENDIAN: "little"
CARGO_CFG_TARGET_ENV: ""
CARGO_CFG_TARGET_HAS_ATOMIC: "16,32,8,ptr"
CARGO_CFG_TARGET_OS: "unknown"
CARGO_CFG_TARGET_POINTER_WIDTH: "32"
CARGO_CFG_TARGET_VENDOR: "unknown"
Normally I would do #[cfg(target_os = "linux")] but that probably doesn't work in this case because #[cfg(target_os = "unknown")] probably matches more than wasm32-unknown-unknown. Do I have to use a combination of target_arch and target_os for this to work properly or maybe just target_arch?
This is how stdweb is doing it:
#[cfg(all(target_arch = "wasm32", target_os = "unknown"))]
I tested it out and it looks like something simple like this works just fine:
#[cfg(target_arch = "wasm32")]
fn add_seven(x: i32) -> i32 {
x + 7
}
#[cfg(not(target_arch = "wasm32"))]
fn add_seven(x: i32) -> i32 {
x + 6
}
fn main() {
let eight = add_seven(1);
println!("{}", eight);
}
Conditional compilation in Rust allows for a great amount of granularity in that you can specify OS, architecture, etc. If you do not need that granularity then you do not have to use it.
There are unknown and emscripten OS targets for wasm32, so it would be best to differentiate the two if your code needs to be different for the two platforms.
Stdweb has chosen to use the more granular approach. If I were doing it I would follow what they are doing, but it seems like it would work either way.

Is it possible to declare config tokens within a source file?

In Rust, it's possible to perform conditional compilation as follows.
#[cfg(rust_version = "1.10")]
fn my_func() {}
Is it possible to define variables for cfg to check within the same source file?
For example:
// leave off, just a quick test to enable when troubleshooting.
#define use_counter 1 // C style (not valid Rust)
#[cfg(use_counter == "1")]
static mut fn_counter: usize = 0;
fn my_func() {
#[cfg(use_counter = "1")]
unsafe { fn_counter += 1; }
}
main () {
// code calling 'my_func'
// print how many times the function is called.
#[cfg(use_counter = "1")]
unsafe { println!("Function count {}", fn_counter); }
}
I'm not asking how to write a function counter, it's just an example of optionally inserting logic into a source file.
Yes, this is written as #[cfg(use_counter)]. Such flags can be enabled or disabled on the command line at compile time and are not exposed in Cargo.toml.
fn main() {
#[cfg(use_counter)]
println!("counter is enabled");
}
Using Cargo, run with the feature disabled:
$ cargo run
Using Cargo, run with the feature enabled:
$ RUSTFLAGS="--cfg use_counter" cargo run
Compile directly with the feature disabled:
$ rustc src/main.rs
Compile with the feature enabled:
$ rustc src/main.rs --cfg use_counter

Finding executable in PATH with Rust

In Python I can:
from distutils import spawn
cmd = spawn.find_executable("commandname")
I tried something like the code below, but it it assumes you're on unix-like system with /usr/bin/which available(also it involves execution of external command which I want to avoid):
use std::process::Command;
let output = Command::new("which")
.arg("commandname")
.unwrap_or_else(|e| /* handle error here */)
What is the simplest way to do this in Rust?
I found a crate that solves the problem: which. It includes Windows support, even accounting for PATHEXT.
I'd probably grab the environment variable and iterate through it, returning the first matching path:
use std::env;
use std::path::{Path, PathBuf};
fn find_it<P>(exe_name: P) -> Option<PathBuf>
where P: AsRef<Path>,
{
env::var_os("PATH").and_then(|paths| {
env::split_paths(&paths).filter_map(|dir| {
let full_path = dir.join(&exe_name);
if full_path.is_file() {
Some(full_path)
} else {
None
}
}).next()
})
}
fn main() {
println!("{:?}", find_it("cat"));
println!("{:?}", find_it("dog"));
}
This is probably ugly on Windows as you'd have to append the .exe to the executable name. It should also potentially be extended to only return items that are executable, which is again platform-specific code.
Reviewing the Python implementation, it appears they also support an absolute path being passed. That's up to you if the function should support that or not.
A quick search on crates.io returned one crate that may be useful: quale, although it currently says
currently only works on Unix-like operating systems.
It wouldn't surprise me to find out there are others.
Here's some ugly code that adds .exe to the end if it's missing, but only on Windows.
#[cfg(not(target_os = "windows"))]
fn enhance_exe_name(exe_name: &Path) -> Cow<Path> {
exe_name.into()
}
#[cfg(target_os = "windows")]
fn enhance_exe_name(exe_name: &Path) -> Cow<Path> {
use std::ffi::OsStr;
use std::os::windows::ffi::OsStrExt;
let raw_input: Vec<_> = exe_name.as_os_str().encode_wide().collect();
let raw_extension: Vec<_> = OsStr::new(".exe").encode_wide().collect();
if raw_input.ends_with(&raw_extension) {
exe_name.into()
} else {
let mut with_exe = exe_name.as_os_str().to_owned();
with_exe.push(".exe");
PathBuf::from(with_exe).into()
}
}
// At the top of the `find_it` function:
// let exe_name = enhance_exe_name(exe_name.as_ref());

Resources