Match String Tuple in Rust

This is one of those simple-but-I-don't-know-how-to-do-it-in-rust things.
Simply put:
pub fn pair_matcher(tup: &(String, String)) {
match tup {
&("foo".as_string(), "bar".as_string()) => print!("foobar"),
_ => print!("Unknown"),
}
}
I get the error
-:3:17: 3:18 error: expected `,`, found `.`
-:3 &("foo".as_string(),"bar".as_string())=> { print!("foobar"); }
^
How do you match this?

The left hand side of each branch of a match is not an expression, it is a pattern, which restricts what can go there to basically just literals (plus things like ref which change the binding behaviour); function calls are right out. Given how String works, it’s not possible to get one of them into a pattern (because you can’t construct one statically). It could be achieved with if statements:
if *tup == ("foo".to_string(), "bar".to_string()) {
print!("foobar")
} else {
print!("Unknown")
}
… or by taking a slice of the Strings, yielding the type &str which can be constructed literally:
match (tup.0.as_slice(), tup.1.as_slice()) {
("foo", "bar") => print!("foobar"),
_ => print!("Unknown"),
}
Constructing a new String each time is an expensive way of doing things, while using the slices is pretty much free, entailing no allocations.
Note that the .0 and .1 required #![feature(tuple_indexing)] on the crate when this was written (tuple indexing has since been stabilised); one can do without it thus:
let (ref a, ref b) = tup;
match (a.as_slice(), b.as_slice()) {
("foo", "bar") => print!("foobar"),
_ => print!("Unknown"),
}
Because, you see, the left hand side of a let statement is a pattern as well, and so you can pull the tuple apart with it, taking references to each element, with (ref a, ref b) yielding variables a and b both of type &String.
The Patterns section of the guide goes into some more detail on the subject.
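For readers on current stable Rust, where String no longer has as_slice, the slice-based match above can be sketched as follows. This is only an illustration under a couple of assumptions: as_str stands in for as_slice, and the function returns a &'static str instead of printing (my own choice, so the result is easy to check). It is not the original answer's code.

```rust
/// Classifies a pair of owned strings by matching on borrowed `&str` views.
/// No new `String` is allocated: `as_str` only borrows the existing data.
fn pair_matcher(tup: &(String, String)) -> &'static str {
    match (tup.0.as_str(), tup.1.as_str()) {
        // String literals are valid patterns for `&str` values.
        ("foo", "bar") => "foobar",
        _ => "Unknown",
    }
}
```

Calling pair_matcher(&("foo".to_string(), "bar".to_string())) takes the first arm; any other pair falls through to the wildcard.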

The solution here is that you need to cast types in the other direction:
match (tup.0.as_slice(), tup.1.as_slice()) {
("foo", "bar") => print!("foobar"),
_ => print!("Unknown"),
}

Related

How does match compile with `continue` in its arms?

I'm reading The Rust Programming Language book and I stumbled upon a simple expression:
let guess: u32 = match guess.trim().parse() {
Ok(num) => num,
Err(_) => continue,
};
How does match work with different kinds of expressions in its arms? E.g. the first arm would simply "return" num so that it's assigned to guess but in the second arm the expression is simply continue. How does match handle that and doesn't "assign" continue to guess but executes it? What happens with the whole assignment expression itself? Is it dropped from the call stack (if that's the correct term)?
continue has a special type: it returns the never type, denoted !.
This type means "the code after that is unreachable". Since continue jumps to the next cycle of the loop, it'll never actually return any value (the same is true for return and break, and it's also the return type of panic!(), including all macros that panic: unreachable!(), todo!(), etc.).
The never type is special because it coerces (converts automatically) to any type (because if something cannot happen, we have no problems thinking about it as u32 or String or whatever - it will just not happen). This means it also unifies with any other type: the unification of any type with ! is that other type.
match requires the types of its arms' expressions to unify (as does if). So your code yields the unification of ! and u32, which is u32.
You can see this if you annotate the types (requires nightly, since using the ! type anywhere other than the return type position is experimental):
let num = match num {
Ok(num) => {
let num: i32 = num;
num
}
Err(()) => {
let never: ! = continue;
never
}
};
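The same unification can be observed on stable Rust without naming the ! type at all. Below is a minimal sketch; parse_all and its signature are hypothetical names of mine, not something from the book. The Err arm has type !, which coerces to u32, so the whole match has type u32 and the assignment is simply skipped for that iteration.

```rust
/// Parses every input that is a valid `u32`, silently skipping the rest.
fn parse_all(inputs: &[&str]) -> Vec<u32> {
    let mut out = Vec::new();
    for raw in inputs {
        // Both arms must unify to `u32`. `continue` has type `!`,
        // which coerces to `u32`, so this type-checks.
        let n: u32 = match raw.trim().parse() {
            Ok(num) => num,
            Err(_) => continue, // jumps to the next iteration; `n` is never assigned
        };
        out.push(n);
    }
    out
}
```

Nothing is "dropped from the call stack" here: when the Err arm runs, control leaves the match before the let binding completes, so there is simply no value to assign.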

How can I (slice) pattern match on an owned Vec with non-Copy elements?

My goal is to move elements out of an owned Vec.
fn f<F>(x: Vec<F>) -> F {
match x.as_slice() {
&[a, b] => a,
_ => panic!(),
}
}
If F is Copy, that is no problem as one can simply copy out of the slice. When F is not, slice patterns seem a no-go, as the slice is read-only.
Is there such a thing as an "owned slice", or pattern matching on a Vec, to move elements out of x?
Edit: I now see that this code has a more general problem. The function
fn f<T>(x: Vec<T>) -> T {
x[0]
}
leaves "a hole in a Vec", even though it is dropped right after. This is not allowed. This post and this discussion describe that problem.
That leads to the updated question: How can a Vec<T> be properly consumed to do pattern matching?
If you insist on pattern matching, you could do this:
fn f<F>(x: Vec<F>) -> F {
let mut it = x.into_iter();
match (it.next(), it.next(), it.next()) {
(Some(x0), Some(_x1), None) => x0,
_ => panic!(),
}
}
However, if you just want to retrieve the first element of a 2-element vector (panicking in other cases), I guess I'd rather go with this:
fn f<F>(x: Vec<F>) -> F {
assert_eq!(x.len(), 2);
x.into_iter().next().unwrap()
}
You can't use pattern matching with slice patterns in this scenario.
As you have correctly mentioned in your question edits, moving a value out of a Vec leaves it with uninitialized memory. This could then cause Undefined Behaviour when the Vec is subsequently dropped, because its Drop implementation needs to free the heap memory, and possibly drop each element.
There is currently no way to express that your type parameter F does not have a Drop implementation or that it is safe for it to be coerced from uninitialized memory.
You pretty much have to forget the idea of using a slice pattern and write it more explicitly:
fn f<F>(mut x: Vec<F>) -> F {
x.drain(..).next().unwrap()
}
If you are dead set on pattern matching, you can use Itertools::tuples() to match on tuples instead:
use itertools::Itertools; // 0.9.0
fn f<F>(mut x: Vec<F>) -> F {
match x.drain(..).tuples().next() {
Some((a, _)) => a,
None => panic!()
}
}
One way to consume a single element of a vector is to swap the last element with the element you want to consume, and then pop the last element:
fn f<F>(mut x: Vec<F>) -> F {
match x.as_slice() {
[_a, _b] => {
x.swap(0, 1);
x.pop().unwrap() // returns a
},
_ => panic!(),
}
}
The code uses an unwrap which isn't elegant.
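Another option that stays close to pattern matching, assuming a reasonably recent toolchain where Vec<T> can be converted into a fixed-size array with TryFrom, is to turn the vector into an owned [F; 2] and destructure that by value. The sketch below is my own illustration (first_of_two is a hypothetical name), and it returns an Option instead of panicking so that no unwrap is needed.

```rust
use std::convert::TryFrom; // already in the prelude from the 2021 edition onwards

/// Returns the first element if the vector has exactly two elements.
fn first_of_two<F>(x: Vec<F>) -> Option<F> {
    // The conversion consumes the Vec and succeeds only when the length is 2.
    match <[F; 2]>::try_from(x) {
        // An owned array can be destructured by move, even for non-Copy types.
        Ok([a, _b]) => Some(a),
        // On failure the original Vec is handed back; it is dropped here.
        Err(_) => None,
    }
}
```

Because the array is owned, moving a out of it leaves no hole behind: the remaining element is dropped normally when the pattern's temporaries go out of scope.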

Why can I compare a String to a &str using if, but not when using match?

I'm trying to implement a function that reads command line arguments and compares them to hard-coded string literals.
When I do the comparison with an if statement it works like a charm:
fn main() {
let s = String::from("holla!");
if s == "holla!" {
println!("it worked!");
}
}
But using a match statement (which I guess would be more elegant):
fn main() {
let s = String::from("holla!");
match s {
"holla!" => println!("it worked!"),
_ => println!("nothing"),
}
}
I keep getting an error from the compiler that a String was expected but a &'static str was found:
error[E0308]: mismatched types
--> src/main.rs:5:9
|
5 | "holla!" => println!("it worked!"),
| ^^^^^^^^ expected struct `std::string::String`, found reference
|
= note: expected type `std::string::String`
found type `&'static str`
I've seen How to match a String against string literals in Rust? so I know how to fix it, but I want to know why the comparison works when using if but not when using match.
I want to know why the comparison works when using if but not when using match.
It's not so much about if and more because you've used == in the condition. The condition in an if statement is any expression of type bool; you just happen to have chosen to use == there.
The == operator is really a function associated with the PartialEq trait. This trait can be implemented for any pair of types. And, for convenience, String has implementations for PartialEq<str> and PartialEq<&str>, among others - and vice versa.
On the other hand, match expressions use pattern matching for comparison, not ==. A &'static str literal, like "holla!", is a valid pattern, but it can never match a String, which is a completely different type.
Pattern matching lets you concisely compare parts of complex structures, even if the whole thing isn't equal, as well as bind variables to pieces of the match. While Strings don't really benefit from that, it's very powerful for other types, and has an entirely different purpose than ==.
Note that you can use pattern matching with if by instead using the if let construct. Your example would look like this:
if let "holla!" = &*s {
println!("it worked!");
}
Conversely, one way to use == inside a match is like this:
match s {
_ if s == "holla!" => println!("it worked!"),
_ => println!("nothing"),
}
Or, as #ljedrz suggested:
match s == "holla!" {
true => println!("it worked!"),
_ => println!("nothing")
}
As #peter-hall said, there's a mismatch of types because match expressions use pattern matching, which is different from the == operator associated with the PartialEq trait.
There is a second way to resolve this issue: converting your String into a &str (a string slice):
match &s[..] {
"holla!" => println!("it worked!"),
"Hallo!" => println!("with easy to read matches !"),
_ => println!("nothing"),
}
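To tie the answers above together, here is a small sketch of the two conversions I would reach for; the function names are hypothetical and the functions return a message rather than printing, purely so the behaviour is easy to check. as_str is equivalent to &s[..] for this purpose, and Option::as_deref does the same conversion through an Option, which is handy for values such as the result of std::env::args().nth(1).

```rust
/// Matches an owned String by borrowing it as `&str` first.
fn classify(arg: String) -> &'static str {
    match arg.as_str() {
        "holla!" => "it worked!",
        "Hallo!" => "with easy to read matches!",
        _ => "nothing",
    }
}

/// Matches an optional owned String; `as_deref` turns
/// `Option<String>` (borrowed) into `Option<&str>`.
fn classify_opt(arg: Option<String>) -> &'static str {
    match arg.as_deref() {
        Some("holla!") => "it worked!",
        Some(_) => "nothing",
        None => "no argument",
    }
}
```

In both cases the scrutinee has a type that string-literal patterns can actually match, which is the whole point of the answers above.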

Match shadowing example in the Patterns section of the Rust book is very perplexing

In learning Rust, I encountered the following in the official Rust book:
There’s one pitfall with patterns: like anything that introduces a new
binding, they introduce shadowing. For example:
let x = 'x';
let c = 'c';
match c {
x => println!("x: {} c: {}", x, c),
}
println!("x: {}", x)
This prints:
x: c c: c
x: x
In other words, x => matches the pattern and introduces a new binding
named x that’s in scope for the match arm. Because we already have a
binding named x, this new x shadows it.
I don't understand two things:
Why does the match succeed?
Shouldn't the differing value of c and x cause this to fail?
How does the match arm x binding get set to 'c'?
Is that somehow the return of the println! expression?
There is a fundamental misconception of what match is about.
Pattern-matching is NOT about matching on values but about matching on patterns, as the name implies. For convenience and safety, it also allows binding names to the innards of the matched pattern:
match some_option {
Some(x) => println!("Some({})", x),
None => println!("None"),
}
For convenience, match is extended to match the values when matching specifically against literals (integrals or booleans), which I think is at the root of your confusion.
Why? Because a match must be exhaustive!
match expressions are there so the compiler can guarantee that you handle all possibilities; checking that you handle all patterns is easy because they are under the compiler's control, checking that you handle all values is hard in the presence of custom equality operators.
When using just a name in the match clause, you create an irrefutable pattern: a pattern that cannot fail, ever. In this case, the entire value being matched is bound to this name.
You can demonstrate this by adding a second match arm afterward; the compiler will complain that the later pattern is unreachable:
fn main() {
let x = 42;
match x {
name => println!("{}", name),
_ => println!("Other"),
};
}
<anon>:6:5: 6:6 error: unreachable pattern [E0001]
<anon>:6 _ => println!("Other"),
^
Combined with the shadowing rules, which specifically allow hiding a binding in a scope by reusing its name to bind another value, you get the example:
within the match arm, x is bound to the value of 'c'
after the arm, the only x in scope is the original one bound to the value 'x'
Your two points are caused by the same root problem. Coincidentally, the reason that this section exists is to point out the problem you are asking about! I'm afraid that I'm basically going to regurgitate what the book says, with different words.
Check out this sample:
match some_variable {
a_name => {},
}
In this case, the match arm will always succeed. Regardless of the value in some_variable, it will always be bound to the name a_name inside that match arm. It's important to get this part first — the name of the variable that is bound has no relation to anything outside of the match.
Now we turn to your example:
match c {
x => println!("x: {} c: {}", x, c),
}
The exact same logic applies. The match arm will always match, and regardless of the value of c, it will always be bound to the name x inside the arm.
The value of x from the outer scope ('x' in this case) has no bearing whatsoever in a pattern match.
If you wanted to use the value of x to control the pattern match, you can use a match guard:
match c {
a if a == x => println!("yep"),
_ => println!("nope"),
}
Note that in the match guard (if a == x), the variable bindings a and x go back to acting like normal variables that you can test.
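As a side-by-side illustration of the distinction, the sketch below contrasts three cases; the constant X and the function names are my own, chosen for the example. A plain identifier in a pattern introduces a new binding, a match guard compares against an ordinary variable, and a named constant (unlike a let-bound variable) is compared by value when it appears in a pattern.

```rust
const X: char = 'x';

/// A constant in pattern position is compared by value;
/// a lowercase identifier is a fresh binding that matches anything.
fn describe(c: char) -> String {
    match c {
        X => String::from("equals the constant X"),
        other => format!("bound to other: {}", other),
    }
}

/// Comparing against a runtime variable needs a guard,
/// because `x` in pattern position would just shadow it.
fn same_as(c: char, x: char) -> bool {
    match c {
        a if a == x => true,
        _ => false,
    }
}
```

Here describe('c') ends up in the second arm with other bound to 'c', which is exactly what happens to x in the book's example.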

Equivalent of Cons Pattern from F# in Rust for Strings

I am experimenting with Rust by implementing a small F# snippet of mine.
I am at the point where I want to destructure a string of characters. Here is the F#:
let rec internalCheck acc = function
| w :: tail when Char.IsWhiteSpace(w) ->
internalCheck acc tail
| other
| matches
| here
..which can be called like this: internalCheck [] "String here" where the :: operator signifies the right hand side is the "rest of the list".
So I checked the Rust documentation and there are examples for destructuring vectors like this:
let v = vec![1,2,3];
match v {
[] => ...
[first, second, ..rest] => ...
}
..etc. However this is now behind the slice_patterns feature gate. I tried something similar to this:
match input.chars() {
[w, ..] => ...
}
Which informed me that feature gates require non-stable releases to use.
So I downloaded multirust and installed the latest nightly I could find (2016-01-05) and when I finally got the slice_patterns feature working ... I ran into endless errors regarding syntax and "rest" (in the above example) not being allowed.
So, is there an equivalent way to destructure a string of characters, utilizing ::-like functionality ... in Rust? Basically I want to match 1 character with a guard and use "everything else" in the expression that follows.
It is perfectly acceptable if the answer is "No, there isn't". I certainly cannot find many examples of this sort online anywhere and the slice pattern matching doesn't seem to be high on the feature list.
(I will happily delete this question if there is something I missed in the Rust documentation)
You can use the pattern matching with a byte slice:
#![feature(slice_patterns)]
fn internal_check(acc: &[u8]) -> bool {
match acc {
&[b'-', ref tail..] => internal_check(tail),
&[ch, ref tail..] if (ch as char).is_whitespace() => internal_check(tail),
&[] => true,
_ => false,
}
}
fn main() {
for s in ["foo", "bar", " ", " - "].iter() {
println!("text '{}', checks? {}", s, internal_check(s.as_bytes()));
}
}
You can use it with a char slice (where char is a Unicode Scalar Value):
#![feature(slice_patterns)]
fn internal_check(acc: &[char]) -> bool {
match acc {
&['-', ref tail..] => internal_check(tail),
&[ch, ref tail..] if ch.is_whitespace() => internal_check(tail),
&[] => true,
_ => false,
}
}
fn main() {
for s in ["foo", "bar", " ", " - "].iter() {
println!("text '{}', checks? {}",
s, internal_check(&s.chars().collect::<Vec<char>>()));
}
}
But as of now it doesn't work with a &str (producing E0308). I think that is for the best, since &str is neither here nor there: it's a byte slice under the hood, but Rust tries to guarantee that it is valid UTF-8 and tries to remind you to work with &str in terms of Unicode sequences and characters rather than bytes. So to efficiently match on the &str we have to explicitly use the as_bytes method, essentially telling Rust that "we know what we're doing".
That's my reading, anyway. If you want to dig deeper and into the source code of the Rust compiler you might start with issue 1844 and browse the commits and issues linked there.
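For anyone reading this on a newer toolchain: slice patterns have since been stabilised with a different syntax for the "rest" part, so the feature gate and the ref tail.. form above are no longer needed. A sketch of the byte-slice version in the newer syntax, assuming a compiler recent enough to accept tail @ .. on stable:

```rust
/// Returns true if the input consists only of '-' and ASCII/Latin-1 whitespace bytes.
fn internal_check(acc: &[u8]) -> bool {
    match acc {
        // `tail @ ..` binds the remainder of the slice, much like `:: tail` in F#.
        [b'-', tail @ ..] => internal_check(tail),
        // `ch` is a `&u8` here, hence the dereference before the cast.
        [ch, tail @ ..] if (*ch as char).is_whitespace() => internal_check(tail),
        [] => true,
        _ => false,
    }
}
```

It is called the same way as before, for example internal_check(" - ".as_bytes()). It still matches bytes rather than characters, so the caveat about UTF-8 above continues to apply.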
Basically I want to match 1 character with a guard and use "everything
else" in the expression that follows.
If you only want to match on a single character then using the chars iterator to get the characters and matching on the character itself might be better than converting the entire UTF-8 &str into a &[char] slice. For instance, with the chars iterator you don't have to allocate the memory for the characters array.
fn internal_check(acc: &str) -> bool {
for ch in acc.chars() {
match ch {
'-' => (),
ch if ch.is_whitespace() => (),
_ => return false,
}
}
return true;
}
fn main() {
for s in ["foo", "bar", " ", " - "].iter() {
println!("text '{}', checks? {}", s, internal_check(s));
}
}
You can also use the chars iterator to split the &str on the Unicode Scalar Value boundary:
fn internal_check(acc: &str) -> bool {
let mut chars = acc.chars();
match chars.next() {
Some('-') => internal_check(chars.as_str()),
Some(ch) if ch.is_whitespace() => internal_check(chars.as_str()),
None => true,
_ => false,
}
}
fn main() {
for s in ["foo", "bar", " ", " - "].iter() {
println!("text '{}', checks? {}", s, internal_check(s));
}
}
But keep in mind that as of now Rust provides no guarantees of optimizing this tail-recursive function into a loop. (Tail call optimization would've been a welcome addition to the language but it wasn't implemented so far due to LLVM-related difficulties).
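If the lack of guaranteed tail-call optimisation is a concern, the same head/tail idea can be driven by a loop. The sketch below is one way to do it; uncons and internal_check_loop are hypothetical names of mine. uncons plays the role of the cons pattern, splitting a &str into its first char and the remaining &str without allocating.

```rust
/// Splits a string into its first character and the rest, if it is non-empty.
fn uncons(s: &str) -> Option<(char, &str)> {
    let mut chars = s.chars();
    let head = chars.next()?;
    // `as_str` yields the not-yet-consumed remainder of the original string.
    Some((head, chars.as_str()))
}

/// Same check as the recursive version, written as a loop.
fn internal_check_loop(input: &str) -> bool {
    let mut rest = input;
    while let Some((head, tail)) = uncons(rest) {
        match head {
            '-' => rest = tail,
            ch if ch.is_whitespace() => rest = tail,
            _ => return false,
        }
    }
    true
}
```

Since the split happens on char boundaries, this also behaves sensibly for non-ASCII input.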
I don't believe so. Slice patterns aren't likely to be amenable to this, either, since the "and the rest" part of the pattern goes inside the array pattern, which would imply some way of putting said pattern inside a string, which implies an escaping mechanism that doesn't exist.
In addition, Rust doesn't have a proper "concatenation" operator, and the operators it does have can't participate in destructuring. So, I wouldn't hold your breath on this one.
Just going to post this here... it seems to do what I want. As a simple test, this will just print every character in a string, but print Found a whitespace character when it finds a whitespace character. It does this recursively by destructuring a slice of bytes. I must give a shout out to #ArtemGr, who gave me the inspiration to look at working with bytes to see if that fixed the compiler issues I was having with chars.
There are no doubt memory issues I am unaware of as yet here (copying/allocations, etc; especially around the String instances)... but I'll work on those as I dig deeper in to the inner workings of Rust. It's also probably much more verbose than it needs to be.. this is just where I got to after a little tinkering.
#![feature(slice_patterns)]
use std::iter::FromIterator;
use std::vec::Vec;
fn main() {
process("Hello world!".to_string());
}
fn process(input: String) {
match input.as_bytes() {
&[c, ref _rest..] if (c as char).is_whitespace() => { println!("Found a whitespace character"); process(string_from_rest(_rest)) },
&[c, ref _rest..] => { println!("{}", c as char); process(string_from_rest(_rest)) },
_ => ()
}
}
fn string_from_rest(rest: &[u8]) -> String {
String::from_utf8(Vec::from_iter(rest.iter().cloned())).unwrap()
}
Output:
H
e
l
l
o
Found a whitespace character
w
o
r
l
d
!
Obviously, as it's testing against individual bytes (and only considering possible UTF-8 characters when rebuilding the string), it's not going to work with wide characters. My actual use case only requires characters in the ASCII space, so this is sufficient for now.
I guess, to work on wider characters the Rust pattern matching would require the ability to type coerce (which I don't believe you can do currently?), since a Chars<'T> iterator seems to be inferred as &[_]. That could just be my immaturity with the Rust language though during my other attempts.
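Regarding the allocation worries mentioned above, one possible tidy-up, assuming a toolchain with stable slice patterns, is to recurse on the byte slice itself instead of rebuilding a String at every step. This is only a sketch: the out parameter is my own addition so that the lines can be collected and inspected rather than printed, and the ASCII-only limitation is unchanged.

```rust
/// Walks the bytes recursively, recording one line per byte.
fn process(input: &[u8], out: &mut Vec<String>) {
    match input {
        [c, rest @ ..] if (*c as char).is_whitespace() => {
            out.push(String::from("Found a whitespace character"));
            process(rest, out);
        }
        [c, rest @ ..] => {
            out.push((*c as char).to_string());
            process(rest, out);
        }
        // Empty slice: nothing left to do.
        [] => {}
    }
}
```

It would be called with something like process("Hello world!".as_bytes(), &mut lines), after which lines holds the same sequence the original version printed.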
