What benefits are there with making println a macro? - rust

In this code, there is a ! after the println:
fn main() {
println!("Hello, world!");
}
In most languages I have seen, the print operation is a function. Why is it a macro in Rust?

By being a macro (built on the compiler's format_args!), println!() gains the ability to:
Automatically reference its arguments. For example this is valid:
let x = "x".to_string();
println!("{}", x);
println!("{}", x); // Works even though you might expect `x` to have been moved on the previous line.
Accept an arbitrary number of arguments.
Validate, at compile time, that the format string placeholders and arguments match up. This is a common source of bugs with C's printf().
None of those are possible with plain functions or methods.
See also:
Does println! borrow or own the variable?
How can I create a function with a variable number of arguments?
Is it possible to write something as complex as `print!` in a pure Rust macro?
What is the difference between macros and functions in Rust?

Well, let's pretend we made those functions for a moment.
fn println<T: Debug>(format: &str, args: &[T]) {}
We would take in a format string and the arguments to format into it. So if we did
println("hello {:?} is your value", &[3]);
The code for println would search for and replace the {:?} with the debug representation for 3.
That's con 1 of doing these as functions - that string replacement needs to be done at runtime. If you have a macro you could imagine it essentially being the same as
print("hello ");
print("3");
println(" is your value");
But when it's a function, there needs to be runtime scanning and splitting of the string.
In general, Rust likes to avoid unneeded performance hits, so this is a bummer.
Next is that T in the function version.
fn println<T: Debug>(format: &str, args: &[T]) {}
What this signature I made up says is that it expects a slice of things that implement Debug. But it also means that it expects all elements in the slice to be the same type, so this
println("Hello {:?}, {:?}", &[99, "red balloons"]);
wouldn't work because i32 and &'static str aren't the same T and could therefore be different sizes on the stack. To get that to work you'd need to do something like boxing each element and doing dynamic dispatch.
println("Hello {:?}, {:?}", &[Box::new(99), Box::new("red balloons")]);
This way you could have every element be Box<dyn Debug>, but you now have even more unneeded performance hits and the usage is starting to look kinda gnarly.
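To make the runtime cost concrete, here is a minimal sketch of such a function-based formatter. The name format_debug and the "<missing>" marker are hypothetical, and it takes &dyn Debug references instead of Box to skip the heap allocation, though the dynamic dispatch and the runtime scan of the format string remain.

```rust
use std::fmt::Debug;

/// Hypothetical function-based formatter: fills each `{:?}` placeholder at
/// runtime with the Debug output of the next argument.
fn format_debug(format: &str, args: &[&dyn Debug]) -> String {
    let mut out = String::new();
    let mut remaining = args.iter();
    // The format string has to be scanned on every call.
    let mut pieces = format.split("{:?}");
    if let Some(first) = pieces.next() {
        out.push_str(first);
    }
    for piece in pieces {
        // A missing argument can only be detected here, at runtime.
        match remaining.next() {
            Some(arg) => out.push_str(&format!("{:?}", arg)),
            None => out.push_str("<missing>"),
        }
        out.push_str(piece);
    }
    out
}
```

A call such as format_debug("Hello {:?}, {:?}", &[&99, &"red balloons"]) accepts mixed types, but a wrong argument count goes unnoticed until the program runs.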
Then there is the requirement that they want to support printing both Debug and Display implementations.
println!("{}, {:?}", 10, 15);
and at this point there isn't a way to express this as a normal Rust function.
There are more motivating reasons, I'm sure, but this is just a sampling.
For (fun?) let's compare this to what happens in Java in similar circumstances.
In Java everything is, or can be, heap allocated. Everything also "inherits" a toString method from Object, meaning you can get a string representation for anything in your program using dynamic dispatch.
So when you use String.format, you get something similar to what is above for println.
public static String format(String format, Object... args) {
return new Formatter().format(format, args).toString();
}
Object... is just special syntax (varargs) for accepting an array as the second argument; the Java compiler lets you write the call without spelling out the array with {}s.
The big difference is that, unlike rust where different types have different sizes, things in Java are always* behind pointers. Therefore you don't need to know T ahead of time to make the bytecode/machine code to do this.
String.format("Hello %s, %s", 99, "red balloons");
which is doing much the same mechanically as this (ignoring JIT)
println("Hello {:?}, {:?}", &[Box::new(99), Box::new("red balloons")]);
So Rust's problem is: how do you provide ergonomics at least as good as the Java version, which is what many are used to, without incurring unneeded heap allocations or dynamic dispatch? Macros give a mechanism for that solution.
(Java can also solve things like the Debug/Display issue since you can check at runtime for implemented interfaces, but that's not core to the reasoning here)
Add on the fact that using a macro instead of a function that takes a string and array means you can provide compile-time errors for mismatched or missing arguments, and it's a pretty solid design choice.
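As a rough illustration of what a macro buys you, here is a small sketch using macro_rules!. The macro debug_join is hypothetical and far simpler than println!, but it shows a variable number of arguments of mixed types, each borrowed rather than moved, with no boxing or trait objects involved.

```rust
/// Hypothetical macro, not part of the standard library: joins the Debug
/// output of any number of arguments, of any mix of types.
macro_rules! debug_join {
    ($($arg:expr),* $(,)?) => {{
        let mut out = String::new();
        $(
            if !out.is_empty() {
                out.push_str(", ");
            }
            // `&$arg` borrows, so the caller keeps ownership.
            out.push_str(&format!("{:?}", &$arg));
        )*
        out
    }};
}
```

Because the expansion happens at compile time, each argument is formatted through its own concrete Debug implementation, and an argument that does not implement Debug is rejected by the compiler.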

Related

How to create a function interface that computes its output from a vec input using a thread

I want a worker to process a Vec<&str> from within a function that takes that vector as a parameter.
The worker lives and dies in that function through a join().
However, I cannot just use it from within the thread, first intuitively because it does not seem safe to have the variable be usable from multiple threads, and second because the compiler does not like its lifetime:
list has an anonymous lifetime '_ but it needs to satisfy a 'static lifetime requirement
Rust playground
I do not want it to have a static lifetime, since that would propagate lifetime management complexity to the consumer of the function. Intuitively, it should be feasible, since my intent is for the worker to end before the function returns.
Approach 1: Using a local variable with a cloned version of the vector is not sufficient, for reasons that are not perfectly clear to me. I believe that &str does not get deeply cloned when the vector does.
The idea, though, is what I want my computer to do: I do want it to have a completely separate copy in RAM moved into the thread, starting with the same string bytes, and decorrelated lifetimes.
Approach 2: The same is true when using message passing (playground).
Approach 3: Boxes, Arc, etc won’t help us here. At least, that maps to my intuition, since I assume the lifetime issue relates to something somewhere in the data structure remaining linked to the parameter.
Approach 4: Forcing a deeper clone does not work on &str (playground), but it works on String (playground).
However, the worker uses library functions that take &Vec<&str>, and deref coercion does not work for iterators (playground).
value of type Vec<&str> cannot be built from std::iter::Iterator<Item=String>
Is there an approach that works better than those outlined?
The last approach I can think of is to do an unsafe memory copy of the strings, but I wonder whether it might be a little bit outrageous.
What's missing from approach 4 is the conversion from Vec<String> to Vec<&str> within your worker.
This can be achieved with a function like this:
fn slice_as_ref(s: &[String]) -> Vec<&str> {
s.iter().map(|s| s.as_str()).collect()
}
See it in action here: playground.
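Put together, approach 4 might look like the sketch below. The function total_len is a hypothetical stand-in for the library functions that take &Vec<&str>, and the return value is only there to make the result observable.

```rust
use std::thread;

fn slice_as_ref(s: &[String]) -> Vec<&str> {
    s.iter().map(|s| s.as_str()).collect()
}

/// Stand-in for a library function that insists on `&Vec<&str>` (hypothetical).
fn total_len(list: &Vec<&str>) -> usize {
    list.iter().map(|s| s.len()).sum()
}

fn worker(list: &Vec<&str>) -> usize {
    // Deep copy: each &str becomes an owned String with no borrowed lifetime.
    let owned: Vec<String> = list.iter().map(|s| s.to_string()).collect();
    let handle = thread::spawn(move || {
        // Rebuild the borrowed view inside the thread, where `owned` lives.
        let view = slice_as_ref(&owned);
        total_len(&view)
    });
    handle.join().unwrap()
}
```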
You can use crossbeam's scoped threads. The standard library has offered an equivalent API, std::thread::scope, since Rust 1.63.
fn worker(list: &Vec<&str>) {
crossbeam::scope(|s| {
s.spawn(|_| println!("list: {:?}", list));
})
.unwrap();
}
Playground.
(Though note that it is more idiomatic to take &[&str] than &Vec<&str>).
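Assuming Rust 1.63 or newer, where std::thread::scope is available, the same idea might be written without crossbeam. This is a minimal sketch; summing the string lengths is just a placeholder for the real work.

```rust
use std::thread;

fn worker(list: &[&str]) -> usize {
    // Threads spawned inside the scope are joined before `scope` returns,
    // so borrowing `list` needs no 'static lifetime.
    thread::scope(|s| {
        let handle = s.spawn(|| list.iter().map(|item| item.len()).sum::<usize>());
        handle.join().unwrap()
    })
}
```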

Rust: can I have a fixed size slice by borrowing the whole fixed size array in a smaller scope in a simple way

I saw the workarounds and they were kind of long. Am I missing a feature of Rust or a simple solution (important: not a workaround)? I feel like I should be able to do this with maybe a simple macro, but the arrayref crate's implementations aren't what I am looking for. Is this a feature that needs to be added to Rust, or is creating a fixed-size slice from a fixed-size array in a smaller scope a bad idea?
Basically what I want to do is this;
fn f(arr:[u8;4]){
arr[0];
}
fn basic(){
let mut arr:[u8;12] = [0;12];
// can't I borrow the whole array but get a fixed-size slice of it?
f(&mut arr[8..12]); // But this is known at compile time?
f(&mut arr[8..12] as &[u8;4]); // Why can't I do these things?
}
What I want can be achieved by the code below (from other SO threads):
use arrayref::mut_array_refs;
fn foo(){
let mut buf:[u8;12] = [0;12];
let (_, fixed_slice) = mut_array_refs![
&mut buf,
8,
4
];
write_u32_into(fixed_slice,0);
}
fn write_u32_into(fixed_slice:&mut [u8;4],num:u32){
// won't have to check if fixed_slice.len() == 4 and won't panic
}
But I looked into the crate, and even though this never panics, there are many unsafe blocks and many lines of code. It is a workaround for Rust itself. What I wanted in the first place was to get rid of the overhead of checking the size and the possible runtime panic.
Also, "the overhead is small, so it doesn't matter" isn't a valid answer: technically I should be able to guarantee this at compile time. Even if the overhead is small, that doesn't mean Rust doesn't need this kind of feature or that I shouldn't look for an ideal way.
Note: Can this be solved with lifetimes?
Edit: Suppose we had a different syntax for fixed slices, such as arr[12;;16], and borrowing through it borrowed the whole arr. I think many functions (write_u32, for example) could then be implemented in a more "rusty" way.
Use a let binding with slice patterns. The feature was stabilized in Rust 1.42.
let v = [1, 2, 3]; // inferred [i32; 3]
let [_, ref subarray @ ..] = v; // subarray is &[i32; 2]
let a = v[0]; // 1
let b = subarray[1]; // 3
Here is a section from the Rust reference about slice patterns.
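Applied to the question's 12-byte buffer, the pattern might look like the sketch below. The helper write_u32_into mirrors the question's function, and the big-endian byte order and the wrapper write_tail are assumptions made for illustration.

```rust
/// Hypothetical helper mirroring the question's `write_u32_into`.
fn write_u32_into(out: &mut [u8; 4], num: u32) {
    *out = num.to_be_bytes();
}

fn write_tail(buf: &mut [u8; 12], num: u32) {
    // Skip eight elements, then bind the remaining four as `&mut [u8; 4]`.
    // The lengths are checked at compile time, so there is no runtime
    // length check that could panic.
    let [_, _, _, _, _, _, _, _, tail @ ..] = buf;
    write_u32_into(tail, num);
}
```

The obvious drawback is that the number of skipped elements has to be spelled out as wildcards, which gets unwieldy for large offsets.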
Why it doesn't work
What you want is not available as a feature in Rust stable or nightly because multiple things related to const are not stabilized yet, namely const generics and const traits. Traits are involved because arr[8..12] is a call to the core::ops::Index<Range<usize>> trait, which returns a reference to a slice, in your case [u8]. That type is unsized and is not the same type as [u8; 4], even when the compiler could work out the length. Rust is inherently safe and can be overprotective at times to ensure safety.
What can you do then?
You have a few routes you can take to solve this issue, I'll stay in a no_std environment for all this as that seems to be where you're working and will avoid extra crates.
Change the function signature
The current function signature you have takes the four u8s as an owned value. If you only are asking for 4 values you can instead take those values as parameters to the function. This option breaks down when you need larger arrays but at that point, it would be better to take the array as a reference or using the method below.
The most common way, and the best way in my opinion, is to take the array in as a reference to a slice (&[u8] or &mut [u8]). This is not the same as taking a pointer to the value in C: slices in Rust also carry their own length, so you can safely iterate through them without worrying about buffer overruns or whether you read all the data. This does require changing the algorithms below to account for variable-sized input, but most of the time there is an equally good option to use.
The safe way
Slices can be converted to arrays using TryInto, but this comes at the cost of a runtime size check, which you seem to want to avoid. It is an option though, and the performance impact is likely to be minimal.
Example:
use core::convert::TryInto; // already in the prelude since the 2021 edition
fn f(arr: [u8;4]){
arr[0];
}
fn basic(){
let arr:[u8;12] = [0;12];
f(arr[8..12].try_into().unwrap());
}
The unsafe way
If you're willing to leave the land of safety there are quite a few things you can do to force the compiler to recognize the data as you want it to, but they can be abused. It's usually better to use rust idioms rather than force other methods in but this is a valid option.
fn basic(){
let mut arr:[u8;12] = [0;12];
f(unsafe {*(arr[8..12].as_ptr() as *const [u8; 4])});
}
TL;DR
I recommend changing your types to use slices rather than arrays, but if that's not feasible I'd suggest avoiding unsafety; the performance won't be as bad as you think.

Is it possible to call a function in Rust by naming its arguments?

I don't know whether this is considered to be a good programming practice but personally, I find that calling a function by naming its arguments makes the code more readable. I don't know whether this is possible in Rust programming language. I didn't find any named call expression in the grammar:
https://doc.rust-lang.org/reference/expressions/call-expr.html
So for example the following doesn't compile:
fn add(n1 : i32, n2 : i32) -> i32 {
n1 + n2
}
fn main() {
let sum_value = add(n1 = 124, n2 = 200);
println!("sum = {}", sum_value);
}
Therefore my question is: is naming arguments in a function call possible in Rust, and if the answer is yes, is it considered good practice according to Rust best practices? (I'm a beginner)
Environment:
OS: Linux Ubuntu MATE 20.04.1 (64 bits)
rustc --version: rustc 1.46.0 (04488afe3 2020-08-24)
Therefore my question is: is naming arguments in a function call possible in Rust
Rust does not support named parameters as part of the language.
is it considered to be a good practice according to Rust best practices? (I'm a beginner)
Generally not (the rather strong typing usually helps mitigate this issue)
In cases where it really is useful two patterns crop up repeatedly:
an options struct, the function would have a limited number of simple parameters and a named structure to pass more structured data, providing "naming"
the builder pattern, which is an evolved and more literate version of the former (and more conducive to optional parameters)
See the article linked to https://old.reddit.com/r/rust/comments/fg6vrn/for_your_consideration_an_alternative_to_the/ for more information (I'm linking to the thread because there is useful discussion of the options, as well as links to helpful crates).
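As a minimal sketch of the options-struct pattern (the ConnectOptions type, its fields, and the defaults are all hypothetical), the field names take over the role of argument names and Default supplies everything the caller does not mention:

```rust
/// Hypothetical options struct; the field names act as argument names.
#[derive(Debug, Clone, PartialEq)]
struct ConnectOptions {
    host: String,
    port: u16,
    retries: u32,
}

impl Default for ConnectOptions {
    fn default() -> Self {
        ConnectOptions {
            host: "localhost".to_string(),
            port: 8080,
            retries: 3,
        }
    }
}

fn describe(opts: &ConnectOptions) -> String {
    format!("{}:{} ({} retries)", opts.host, opts.port, opts.retries)
}
```

A call then reads describe(&ConnectOptions { port: 9000, ..Default::default() }), which names the one argument that differs from the defaults.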
There is an RFC open on this topic.
Although not everyone in this RFC agrees, it seems that there is generally support for adding named and default parameters to Rust (including support from the core team). It seems the primary obstacle is not in the concept, but in the implementation.
As noted in the RFC, alternatives (such as the builder pattern) have the same issues as built-in named and default parameters, but also add large amounts of boilerplate.
No, there are no named/keyword parameters in Rust. They have been discussed for a long time, but there are no concrete plans to add them.
If you have many parameters in a function, consider passing a struct, the builder pattern, etc.
By the way: the add() example does not show why named/keyword parameters could be useful, since the parameters are interchangeable.
You can work around it with #![feature(adt_const_params)] if you're on nightly. It allows you to use &'static str constants as const generic parameters. It's clunky but it's still the best I could come up with so far:
#![feature(adt_const_params)]
pub struct Param<const NAME: &'static str, T>(pub T);
fn foo(
Param(person_name): Param::<"person_name", String>,
Param(age): Param::<"age", u8>,
){
println!("{} is {} years old", person_name, age)
}
fn main() {
foo(
Param::<"person_name", _>("Bob".to_owned()),
Param::<"age", _>(123)
)
}
The Good:
The compiler will error out if you get the names of the params wrong, or if you try to use them in the incorrect order;
No macros;
No new types for every parameter in every function;
Unlike the builder pattern, you can't set the same parameter multiple times.
The Bad
Nothing guarantees that the actual name of the function parameters and the const &'static str that you used are identical, which is disappointing but not very error prone;
You can have multiple parameters with the same name;
Syntax is pretty verbose on the call site;
Getting the value out of the parameter needs either destructuring like in the example (thanks, @Chayim Friedman) or something like person_name.0;
You can't change the parameter order, though that's pretty minor.
There are no named parameters in the language, but they can be emulated with macros. See https://crates.io/crates/named for an experimental solution.

Why are len() and is_empty() not defined in a trait?

Most patterns in Rust are captured by traits (Iterator, From, Borrow, etc.).
How come a pattern as pervasive as len/is_empty has no associated trait in the standard library? Would that cause problems which I do not foresee? Was it deemed useless? Or is it only that nobody thought of it (which seems unlikely)?
Was it deemed useless?
I would guess that's the reason.
What could you do with the knowledge that something is empty or has length 15? Pretty much nothing, unless you also have a way to access the elements of the collection for example. The trait that unifies collections is Iterator. In particular an iterator can tell you how many elements its underlying collection has, but it also does a lot more.
Also note that should you need an Empty trait, you can create one and implement it for all standard collections, unlike interfaces in most languages. This is the power of traits. This also means that the standard library doesn't need to provide small utility traits for every single use case, they can be provided by libraries!
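Such a do-it-yourself trait could be sketched as follows. The trait name Len and the helper summary are hypothetical, and only three standard types are covered here.

```rust
use std::collections::HashMap;

/// Hypothetical user-defined trait; nothing like it exists in std.
trait Len {
    fn len(&self) -> usize;
    fn is_empty(&self) -> bool {
        self.len() == 0
    }
}

// Each impl simply forwards to the type's inherent `len` method.
impl<T> Len for Vec<T> {
    fn len(&self) -> usize {
        Vec::len(self)
    }
}

impl Len for str {
    fn len(&self) -> usize {
        str::len(self)
    }
}

impl<K, V> Len for HashMap<K, V> {
    fn len(&self) -> usize {
        HashMap::len(self)
    }
}

fn summary<C: Len + ?Sized>(c: &C) -> String {
    format!("len={}, empty={}", c.len(), c.is_empty())
}
```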
Just adding a late but perhaps useful answer here. Depending on what exactly you need, using the slice type might be a good option, rather than specifying a trait. Slices have len(), is_empty(), and other useful methods (full docs here). Consider the following:
use core::fmt::Display;
fn printme<T: Display>(x: &[T]) {
println!("length: {}, empty: {}", x.len(), x.is_empty());
for c in x {
print!("{}, ", c);
}
println!("\nDone!");
}
fn main() {
let s = "This is some string";
// Vector
let vv: Vec<char> = s.chars().collect();
printme(&vv);
// Array
let x = [1, 2, 3, 4];
printme(&x);
// Empty
let y:Vec<u8> = Vec::new();
printme(&y);
}
printme can accept either a vector or an array. Most other things that it accepts will need some massaging.
I think maybe the reason for there being no Length trait is that most functions will either a) work through an iterator without needing to know its length (with Iterator), or b) require len because they do some sort of random element access, in which case a slice would be the best bet. In the first case, knowing length may be helpful to pre-allocate memory of some size, but size_hint takes care of this when used for anything like Vec::with_capacity, or ExactSizeIterator for anything that needs specific allocations. Most other cases would probably need to be collected to a vector at some point within the function, which has its len.
Playground link to my example here: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9a034c2e8b75775449afa110c05858e7
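As a rough illustration of the pre-allocation point, here is a small sketch that relies on ExactSizeIterator instead of a length trait. The function doubled is hypothetical.

```rust
/// Hypothetical helper: doubles every item, pre-allocating from `len()`.
fn doubled<I>(items: I) -> Vec<u32>
where
    I: ExactSizeIterator<Item = u32>,
{
    // ExactSizeIterator promises that `len()` reports the exact item count.
    let mut out = Vec::with_capacity(items.len());
    for item in items {
        out.push(item * 2);
    }
    out
}
```

Any iterator that knows its exact length qualifies, for example vec.into_iter() or a numeric range, so no collection-level trait is needed.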

Dynamically inferring the type of a string

Rust newbie here. What would be a good way to go about dynamically inferring the most probable type given a string? I am trying to code a function that given a string returns the most possible type but I have no idea where to start. In Python I would probably use a try-except block. This is what I would expect to have:
"4" -> u32 (or u64)
"askdjf" -> String
"3.2" -> f64
and so on? I know that some strings can be assigned to several possible types so the problem is not well defined but I am only interested in the general philosophy on how to solve the problem efficiently in rust.
There is a parse method on string slices (&str) that attempts to parse a string as a particular type. You'll have to know the specific types you're ready to handle, though. The parse method can return values of any type that implements FromStr.
fn main() {
if let Ok(i) = "1".parse::<u32>() {
println!("{}", i);
}
if let Ok(f) = "1.1".parse::<f64>() {
println!("{}", f);
}
}
Note that the ::<T> part is only necessary if the compiler is unable to infer what type you're trying to parse into (you'll get a compiler error in that case).
I am trying to code a function that given a string returns the most possible type but I have no idea where to start.
First of all: Rust is statically typed which means that a function returns one and only one type, so you can't just return different types, like in dynamically typed languages. However, there are ways to simulate dynamic typing -- namely two (that I can think of):
enum: If you have a fixed number of possible types, you could define an enum with one variant per type, like this:
enum DynType {
Integer(i64),
Float(f32),
String(String),
}
fn dyn_parse(s: &str) -> DynType {
...
}
You can read more on enums in this and the following Rust book chapter.
There is a trait in the standard library designed to simulate dynamic typing: Any. There is more information here. Your code could look like this:
fn dyn_parse(s: &str) -> Box<dyn Any> {
...
}
You can't return trait objects directly, so you have to put it in a Box.
Keep in mind that both possibilities require the user of your function to do additional dispatch. Since Rust is statically typed, you can't do the things you are used to in a dynamically typed language.
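To show what that additional dispatch looks like with Any, here is a minimal sketch. The parsing rule (i64 first, otherwise String) and the describe helper are assumptions made for illustration.

```rust
use std::any::Any;

fn dyn_parse(s: &str) -> Box<dyn Any> {
    match s.parse::<i64>() {
        Ok(i) => Box::new(i),
        Err(_) => Box::new(s.to_string()),
    }
}

/// The caller has to probe for each concrete type it is prepared to handle.
fn describe(value: &dyn Any) -> String {
    if let Some(i) = value.downcast_ref::<i64>() {
        format!("integer {}", i)
    } else if let Some(s) = value.downcast_ref::<String>() {
        format!("string {}", s)
    } else {
        "unknown".to_string()
    }
}
```

When calling it, pass &*boxed rather than &boxed, so that the value inside the Box is inspected and not the Box itself.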
Maybe you should try to solve your problems in a different way that makes more sense in the statically typed world.
About the implementation part: Like Francis Gagné said, there is parse which tries to parse a string as a type the programmer specifies. You could of course just chain those parse calls with different types and take the first one that succeeds. But this might not be what you want and maybe not the fastest implementation.
Of course you should first think of exact rules what string should parse as what type. After that you could, for example, build a finite state machine that detects the type of the string. Doing that properly could be a bit tricky though.
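The chained-parse idea with the enum approach might be sketched like this. It uses f64 for the float variant, as in the question, and the ordering of the attempts is one possible rule among many.

```rust
#[derive(Debug, PartialEq)]
enum DynType {
    Integer(i64),
    Float(f64),
    String(String),
}

/// Tries the narrowest type first and falls back to a plain string.
fn dyn_parse(s: &str) -> DynType {
    if let Ok(i) = s.parse::<i64>() {
        DynType::Integer(i)
    } else if let Ok(f) = s.parse::<f64>() {
        DynType::Float(f)
    } else {
        DynType::String(s.to_string())
    }
}
```

Note that the f64 parser also accepts inputs such as "inf" and "NaN", which is one reason to write down the exact rules before settling on an implementation.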
