What is the correct & idiomatic way to check if a string starts with a certain character in Rust?

I want to check whether a string starts with some chars:
    for line in lines_of_text.split("\n").collect::<Vec<_>>().iter() {
        let rendered = match line.char_at(0) {
            '#' => {
                // Heading
                Cyan.paint(*line).to_string()
            }
            '>' => {
                // Quotation
                White.paint(*line).to_string()
            }
            '-' => {
                // Inline list
                Green.paint(*line).to_string()
            }
            '`' => {
                // Code
                White.paint(*line).to_string()
            }
            _ => (*line).to_string(),
        };
        println!("{:?}", rendered);
    }
I've used char_at, but it reports an error due to its instability.
main.rs:49:29: 49:39 error: use of unstable library feature 'str_char': frequently replaced by the chars() iterator, this method may be removed or possibly renamed in the future; it is normally replaced by chars/char_indices iterators or by getting the first char from a subslice (see issue #27754)
main.rs:49 let rendered = match line.char_at(0) {
^~~~~~~~~~
I'm currently using Rust 1.5

The error message gives useful hints on what to do:
frequently replaced by the chars() iterator, this method may be removed or possibly renamed in the future; it is normally replaced by chars/char_indices iterators or by getting the first char from a subslice (see issue #27754)
We could follow the error text:
    for line in lines_of_text.split("\n") {
        match line.chars().next() {
            Some('#') => println!("Heading"),
            Some('>') => println!("Quotation"),
            Some('-') => println!("Inline list"),
            Some('`') => println!("Code"),
            Some(_) => println!("Other"),
            None => println!("Empty string"),
        };
    }
Note that this exposes an error condition you were not handling! What if there was no first character?
We could slice the string and then pattern match on string slices:
    for line in lines_of_text.split("\n") {
        match &line[..1] {
            "#" => println!("Heading"),
            ">" => println!("Quotation"),
            "-" => println!("Inline list"),
            "`" => println!("Code"),
            _ => println!("Other")
        };
    }
Slicing a string operates by bytes and thus this will panic if your first character isn't exactly 1 byte (a.k.a. an ASCII character). It will also panic if the string is empty. You can choose to avoid these panics:
    for line in lines_of_text.split("\n") {
        match line.get(..1) {
            Some("#") => println!("Heading"),
            Some(">") => println!("Quotation"),
            Some("-") => println!("Inline list"),
            Some("`") => println!("Code"),
            _ => println!("Other"),
        };
    }
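To make the difference between the byte slice and `get` concrete, here is a minimal sketch. The helper name `first_byte` is made up for illustration and does not come from the question; it only wraps `str::get`.

```rust
/// Returns the first byte of `line` as a string slice, or `None` when the
/// string is empty or byte index 1 is not a character boundary.
/// (`first_byte` is a hypothetical helper name used only for this sketch.)
fn first_byte(line: &str) -> Option<&str> {
    line.get(..1)
}

fn main() {
    // ASCII first character: the one-byte slice exists.
    assert_eq!(first_byte("# Title"), Some("#"));
    // Empty string: no panic, just `None`.
    assert_eq!(first_byte(""), None);
    // 'é' takes two bytes in UTF-8, so byte index 1 falls inside the character.
    assert_eq!(first_byte("été"), None);
    // `chars().next()` still yields the whole first character.
    assert_eq!("été".chars().next(), Some('é'));
}
```

With `&line[..1]`, the second and third cases would panic instead of returning `None`.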
We could use the method that is a direct match to your problem statement, str::starts_with:
    for line in lines_of_text.split("\n") {
        if line.starts_with('#') { println!("Heading") }
        else if line.starts_with('>') { println!("Quotation") }
        else if line.starts_with('-') { println!("Inline list") }
        else if line.starts_with('`') { println!("Code") }
        else { println!("Other") }
    }
Note that this solution doesn't panic if the string is empty or if the first character isn't ASCII. I'd probably pick this solution for those reasons. Putting the if bodies on the same line as the if statement is not normal Rust style, but I put it that way to leave it consistent with the other examples. You should look to see how separating them onto different lines looks.
As an aside, you don't need collect::<Vec<_>>().iter(); it is just inefficient. There's no reason to take an iterator, build a vector from it, then iterate over the vector. Just use the original iterator.
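Putting both pieces of advice together, a sketch might look like the following. The `LineKind` enum and the `classify` function are names invented for this example; the question only prints or colors the line.

```rust
/// Hypothetical classification of a line, invented for this sketch.
#[derive(Debug, PartialEq)]
enum LineKind {
    Heading,
    Quotation,
    InlineList,
    Code,
    Other,
}

/// Classifies a line by its first character using `str::starts_with`.
/// Empty lines and non-ASCII first characters simply fall through to `Other`.
fn classify(line: &str) -> LineKind {
    if line.starts_with('#') {
        LineKind::Heading
    } else if line.starts_with('>') {
        LineKind::Quotation
    } else if line.starts_with('-') {
        LineKind::InlineList
    } else if line.starts_with('`') {
        LineKind::Code
    } else {
        LineKind::Other
    }
}

fn main() {
    let lines_of_text = "# Title\n> quote\n- item\n`code`\n\nplain";
    // Iterate over `split` directly; no intermediate Vec is needed.
    for line in lines_of_text.split('\n') {
        println!("{:?}: {:?}", classify(line), line);
    }
}
```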

Related

Reduce nested match statements with unwrap_or_else in Rust

I have a rust program that has multiple nested match statements as shown below.
    match client.get(url).send() {
        Ok(mut res) => {
            match res.read_to_string(&mut s) {
                Ok(m) => {
                    match get_auth(m) {
                        Ok(k) => k,
                        Err(_) => return Err("a"),
                    }
                },
                Err(_) => {
                    return Err("b");
                }
            }
        },
        Err(_) => {
            return Err("c");
        },
    };
All the variables k and m are of type String. I am looking for a way to make the code more readable by removing the excessive nesting of match statements while keeping the error handling intact, since both the output and the error types are important for the problem. Is it possible to achieve this with unwrap_or_else?
The .map_err() utility converts a Result to have a new error type, leaving the success type alone. It accepts a closure that consumes the existing error value and returns the new one.
The ? operator will early-return the error in the Err case, and unwrap in the Ok case.
Combining these two allows you to express this same flow succinctly:
    get_auth(
        client.get(url).send().map_err(|_| "c")?
            .read_to_string(&mut s).map_err(|_| "b")?
    ).map_err(|_| "a")?
(I suspect that you actually want to pass s to get_auth() but that's not what the code in your question does, so I'm choosing to represent the code you posted instead of imaginary code that I'm guessing about.)
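For readers who want to try the pattern without an HTTP client, here is a self-contained sketch. The functions `send`, `read_body`, `get_auth`, and `authenticate` are hypothetical stand-ins written for this example; only `map_err` and `?` are the real subject.

```rust
// Hypothetical stand-in for `client.get(url).send()`.
fn send(url: &str) -> Result<String, String> {
    if url.is_empty() {
        Err("empty url".to_string())
    } else {
        Ok(format!("body-of:{}", url))
    }
}

// Hypothetical stand-in for reading the response body.
fn read_body(response: &str) -> Result<String, String> {
    match response.strip_prefix("body-of:") {
        Some("broken") | None => Err("unreadable body".to_string()),
        Some(body) => Ok(body.to_string()),
    }
}

// Hypothetical stand-in for the question's `get_auth`.
fn get_auth(body: String) -> Result<String, String> {
    match body.strip_prefix("key=") {
        Some(key) => Ok(key.to_string()),
        None => Err("no key".to_string()),
    }
}

/// Each step maps its own error to a distinct value, then `?` returns early.
fn authenticate(url: &str) -> Result<String, &'static str> {
    let response = send(url).map_err(|_| "c")?;
    let body = read_body(&response).map_err(|_| "b")?;
    let key = get_auth(body).map_err(|_| "a")?;
    Ok(key)
}

fn main() {
    println!("{:?}", authenticate("key=abc"));
    println!("{:?}", authenticate(""));
}
```

Binding each intermediate result to a name is a style choice; it behaves the same as the chained form and can be easier to step through.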

Accept multiple values on proc macro attribute

I wanted to be able to retrieve the content from an attribute like this:
#[foreign_key(table = "some_table", column = "some_column")]
This is how I am trying:
    impl TryFrom<&&Attribute> for EntityFieldAnnotation {
        type Error = syn::Error;

        fn try_from(attribute: &&Attribute) -> Result<Self, Self::Error> {
            if attribute.path.is_ident("foreign_key") {
                match attribute.parse_args()? {
                    syn::Meta::NameValue(nv) =>
                        println!("NAME VALUE: {:?}, {:?}, {:?}",
                            nv.path.get_ident(),
                            nv.eq_token.to_token_stream(),
                            nv.lit.to_token_stream(),
                        ),
                    _ => println!("Not interesting")
                }
            } else {
                println!("No foreign key")
            }
            // ... More Rust code
        }
    }
Everything works fine if I put only one NameValue in there. When I add the comma, everything breaks.
The only error:
error: unexpected token
How can I fix my logic to allow more than one NameValue?
Thanks
UPDATE: While writing this answer, I had forgotten that Meta has List variant as well which gives you NestedMeta. I would generally prefer doing that instead of what I did in the answer below for more flexibility.
Although, for your particular case, using Punctuated still seems simpler and cleaner to me.
MetaNameValue represents only a single name-value pair. In your case it is delimited by ,, so, you need to parse all of those delimited values as MetaNameValue instead.
Instead of calling parse_args, you can use parse_args_with along with Punctuated::parse_terminated:
    use syn::{punctuated::Punctuated, MetaNameValue, Token};

    // Handle the error instead of calling unwrap.
    let name_values: Punctuated<MetaNameValue, Token![,]> = attribute
        .parse_args_with(Punctuated::parse_terminated)
        .unwrap();
Here name_values has type Punctuated, which implements IntoIterator. You can iterate over it to get each MetaNameValue in your attribute.
Updates based on comments:
Getting value out as String from MetaNameValue:
    let name_values: Result<Punctuated<MetaNameValue, Token![,]>, _> =
        attr.parse_args_with(Punctuated::parse_terminated);

    match name_values {
        Ok(name_value) => {
            for nv in name_value {
                println!("Meta NV: {:?}", nv.path.get_ident());
                let value = match nv.lit {
                    syn::Lit::Str(v) => v.value(),
                    _ => panic!("expected a string value"), // handle this error and don't panic
                };
                println!("Meta VALUE: {:?}", value)
            }
        },
        Err(_) => todo!(),
    };

Zip iterables with Optional and Non Optional parameter in macro

For the testing part of my lexer, I came up with a simple macro that lets me define the expected token type (enum) and the token literal (string):
    macro_rules! token_test {
        ($($ttype:ident: $literal:literal)*) => {
            {
                vec!($($ttype,)*).iter().zip(vec!($($literal,)*).iter())
            }
        }
    }
and then I can use it like this:
    for (ttype, literal) in token_test! {
        Let: "let" Identifier: "five" Assign: "=" Int: "5" Semicolon: ";"
    } {
        //...
    }
However, this is a little verbose, and we don't need to specify the literal for most of the tokens since I have another macro that transforms an enum variant into a string (e.g. Let -> "let").
So what I hope to do is something like:
    for (ttype, literal) in token_test! {
        Let Identifier: "five" Assign Int: "5" Semicolon
    } {
        //...
    }
And if I understood properly, I can use optional parameters to match either TYPE: LITERAL or TYPE. Maybe something like:
    macro_rules! token_test {
        ($($ttype:ident$(: $literal:literal)?)*) => {
            {
                //...
            }
        }
    }
So my question is: is there a way to build a Vec out of this?
To be more clear:
In the case of no literal passed, it should add the string representation of my enum (eg: Let -> "let")
In the case of literal passed, it should add the literal directly
Made it work with the following macro (any improvement welcomed):
    macro_rules! token_test {
        ($($ttype:ident$(: $literal:literal)?)*) => {
            vec!($($ttype,)*).iter().zip(vec!(
                $(
                    {
                        let mut literal = $ttype.as_str().unwrap();
                        $(literal = $literal;)?
                        literal
                    }
                ),*).iter())
        }
    }
This 'iterates' over the macro arguments and initially sets the local literal variable to the result of as_str, which transforms an enum variant into a string. Then, if $literal is defined, it replaces the local value with it. Finally, it returns the local literal variable.
Improvement
    macro_rules! some_or_none {
        () => { None };
        ($entity:literal) => { Some($entity) }
    }

    macro_rules! token_test {
        ($($ttype:ident$(: $literal:literal)?)*) => {
            vec!($($ttype,)*).iter().zip(vec!($(
                some_or_none!($($literal)?).unwrap_or($ttype.as_str().unwrap())
            ),*))
        }
    }
Removed some unnecessary scopes and the second .iter(), and added the some_or_none macro. The intent is to skip as_str when a literal is provided. Note, however, that unwrap_or evaluates its argument eagerly, so as_str is still called here; using unwrap_or_else(|| $ttype.as_str().unwrap()) would make it lazy.
Further improvement
In the above example, two macros are provided. One is clearly a "private" macro, because it exists only to implement the other one. However, there is a small catch in how macro exports work. Unlike functions, an exported macro cannot call a helper macro that is defined in the same scope but is not accessible from the caller. See this playground example. This is not a problem if you don't intend to export the macro, which is plausible since its only purpose is to be used in a test suite. However, you might still want to expose it publicly at the crate level without exposing some_or_none!. The conventional way to do this is to integrate some_or_none! into the token_test! macro as an internal rule, by prepending it with #:
    macro_rules! token_test {
        (#some_or_none) => {
            None
        };
        (#some_or_none $entity:literal) => {
            Some($entity)
        };
        ($($ttype:ident $(: $literal:literal)?)*) => {
            vec!($($ttype,)*)
                .iter()
                .zip(vec!($(
                    token_test!(#some_or_none $($literal)?)
                        .unwrap_or($ttype.as_str().unwrap())
                ),*))
        };
    }
With this version, you can safely export token_test without any fears, as shown in this playground.
Little bit more
original idea from steffahn on the Rust Forum
There is another, similar way to solve this without involving unwrap_or: instead of wrapping into an Option in some_or_none, we can create two branches that take either TYPE + LITERAL or TYPE, like so:
    macro_rules! token_test {
        (#ttype_or_literal $ttype:ident) => { $ttype.as_str().unwrap() };
        (#ttype_or_literal $ttype:ident: $literal:literal) => { $literal };
        ($($ttype:ident $(: $literal:literal)?)*) => {
            vec!($($ttype,)*)
                .iter()
                .zip(vec![$(token_test!(#ttype_or_literal $ttype$(: $literal)?)),*])
        };
    }
And again
As I only need an iterable that can be destructured as (type, literal), an array of pairs is enough:
    macro_rules! token_test {
        (#ttype_or_literal $ttype:ident) => { $ttype.as_str().unwrap() };
        (#ttype_or_literal $ttype:ident: $literal:literal) => { $literal };
        ($($ttype:ident $(: $literal:literal)?)*) => {
            [$(($ttype, token_test!(#ttype_or_literal $ttype$(: $literal)?))),*]
        };
    }
so no more vec and no more zip.
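To see the array-of-pairs version run end to end, here is a sketch with a stand-in token type. The `TokenType` enum and its `as_str` method are assumptions made for this example (the question does not show them); `as_str` is assumed to return an `Option`, which is what the `.unwrap()` in the macro suggests.

```rust
// Hypothetical token type; the real lexer's enum is not shown in the question.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TokenType {
    Let,
    Identifier,
    Assign,
    Int,
    Semicolon,
}

impl TokenType {
    // Assumed signature: `None` for tokens without a fixed spelling.
    fn as_str(&self) -> Option<&'static str> {
        match self {
            TokenType::Let => Some("let"),
            TokenType::Assign => Some("="),
            TokenType::Semicolon => Some(";"),
            _ => None,
        }
    }
}

use TokenType::*;

macro_rules! token_test {
    // Internal rules: pick the literal when given, else the variant's spelling.
    (#ttype_or_literal $ttype:ident) => { $ttype.as_str().unwrap() };
    (#ttype_or_literal $ttype:ident: $literal:literal) => { $literal };
    // Public rule: build an array of (type, literal) pairs.
    ($($ttype:ident $(: $literal:literal)?)*) => {
        [$(($ttype, token_test!(#ttype_or_literal $ttype $(: $literal)?))),*]
    };
}

fn main() {
    for (ttype, literal) in token_test! {
        Let Identifier: "five" Assign Int: "5" Semicolon
    } {
        println!("{:?} => {:?}", ttype, literal);
    }
}
```

Because the literal branch never calls `as_str`, variants without a fixed spelling (here `Identifier` and `Int`) do not hit the `unwrap`.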
A Smart trick
A user on the Rust forum gave this potential trick involving ignoring the second argument if it exists. I made the solution a little bit more compact by not having two macros:
    macro_rules! token_test {
        (#ignore_second $value:expr $(, $_ignored:expr)? $(,)?) => { $value };
        ($($ttype:ident $(: $literal:literal)?)*) => {
            [$(($ttype, token_test!(#ignore_second $($literal,)? $ttype.as_str().unwrap()))),*]
        };
    }

Call more than one function in match arm in Rust

I currently have a match statement in the form of
    match ball.side {
        Side::Left => x(),
        Side::Right => y(),
    }
But what I would need is something along the lines of
    match ball.side {
        Side::Left => x(),a(),
        Side::Right => y(), b(),
    }
And of course this does not compile, but how could I make this kind of sequence work?
I know I could also just work with an if-statement but I am curious how this can exactly be solved with match.
A sequence of statements in a block:
    match ball.side {
        Side::Left => {
            x();
            a();
        }
        Side::Right => {
            y();
            b();
        }
    }
Note that the right side of a match arm must be an expression, and that blocks are expressions (which can produce a value) in Rust.
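As a small illustration of a block arm that both runs several statements and produces a value, consider the sketch below. The `handle` function and the log vector are invented for this example; pushing to the log stands in for the calls to x(), a(), y(), and b().

```rust
#[derive(Debug, Clone, Copy)]
enum Side {
    Left,
    Right,
}

/// Runs two steps per arm and returns a value from the block.
/// (`handle` and `log` are hypothetical names used only for this sketch.)
fn handle(side: Side, log: &mut Vec<&'static str>) -> i32 {
    match side {
        Side::Left => {
            log.push("x");
            log.push("a");
            -1 // last expression without a semicolon: the block's value
        }
        Side::Right => {
            log.push("y");
            log.push("b");
            1
        }
    }
}

fn main() {
    let mut log = Vec::new();
    let direction = handle(Side::Right, &mut log);
    println!("{} {:?}", direction, log);
}
```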

Where will String::from("") be allocated in a match arm?

I am still very new to Rust, coming from a C embedded world.
If I have a piece of code like this:
    match self {
        Command::AT => String::from("AT"),
        Command::GetManufacturerId => String::from("AT+CGMI"),
        Command::GetModelId => String::from("AT+CGMM"),
        Command::GetFWVersion => String::from("AT+CGMR"),
        Command::GetSerialNum => String::from("AT+CGSN"),
        Command::GetId => String::from("ATI9"),
        Command::SetGreetingText { ref enable, ref text } => {
            if *enable {
                if text.len() > 49 {
                    // TODO: Error!
                }
                write!(buffer, "AT+CSGT={},{}", *enable as u8, text).unwrap();
            } else {
                write!(buffer, "AT+CSGT={}", *enable as u8).unwrap();
            }
            buffer
        },
        Command::GetGreetingText => String::from("AT+CSGT?"),
        Command::Store => String::from("AT&W0"),
        Command::ResetDefault => String::from("ATZ0"),
        Command::ResetFactory => String::from("AT+UFACTORY"),
        Command::SetDTR { ref value } => {
            write!(buffer, "AT&D{}", *value as u8).unwrap();
            buffer
        },
        Command::SetDSR { ref value } => {
            write!(buffer, "AT&S{}", *value as u8).unwrap();
            buffer
        },
        Command::SetEcho { ref enable } => {
            write!(buffer, "ATE{}", *enable as u8).unwrap();
            buffer
        },
        Command::GetEcho => String::from("ATE?"),
        Command::SetEscape { ref esc_char } => {
            write!(buffer, "ATS2={}", esc_char).unwrap();
            buffer
        },
        Command::GetEscape => String::from("ATS2?"),
        Command::SetTermination { ref line_term } => {
            write!(buffer, "ATS3={}", line_term).unwrap();
            buffer
        }
    }
How does it work in Rust? Will all these match arms evaluate immediately, or will only the one matching create a mutable copy on the stack? And also, will all the string literals within my String::from("") be allocated in .rodata?
Is there a better way of doing what I am trying to do here? Essentially I want to return a string literal, with replaced parameters (the write! macro bits).
Best regards
Only the matching arm will be evaluated. The non-matching arms have no cost apart from the size of the program.
In the general case, it's not even possible to evaluate other arms, as they depend on data read using destructuring of the pattern.
As for your second question, where string literals are stored depends on the target and toolchain: they usually end up in a read-only data section (named .rodata on ELF targets), but this is neither specified nor guaranteed by the language (literals are usually deduplicated, but that's just an optimization).
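On the "is there a better way" part of the question, one possible direction is sketched below. It is not from the answer above. Keep in mind that String::from copies the literal's bytes into a new heap allocation, so returning Cow<'static, str> lets the fixed commands borrow the literal and only the parameterized ones allocate. The reduced `Command` enum and the `render` function are hypothetical, and Cow still needs an allocator for the owned case, so this may not suit every embedded target.

```rust
use std::borrow::Cow;

// A reduced, hypothetical version of the question's enum.
enum Command {
    At,
    GetEcho,
    SetEcho { enable: bool },
}

/// Borrows the literal for fixed commands; allocates only when formatting.
fn render(cmd: &Command) -> Cow<'static, str> {
    match cmd {
        Command::At => Cow::Borrowed("AT"),
        Command::GetEcho => Cow::Borrowed("ATE?"),
        // Only this arm runs `format!`, and only when it is the one matched.
        Command::SetEcho { enable } => Cow::Owned(format!("ATE{}", *enable as u8)),
    }
}

fn main() {
    println!("{}", render(&Command::At));
    println!("{}", render(&Command::GetEcho));
    println!("{}", render(&Command::SetEcho { enable: true }));
}
```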

Resources