Equivalent of Cons Pattern from F# in Rust for Strings

I am experimenting with Rust by implementing a small F# snippet of mine.
I am at the point where I want to destructure a string of characters. Here is the F#:
let rec internalCheck acc = function
| w :: tail when Char.IsWhiteSpace(w) ->
internalCheck acc tail
| other
| matches
| here
..which can be called like this: internalCheck [] "String here" where the :: operator signifies the right hand side is the "rest of the list".
So I checked the Rust documentation and there are examples for destructuring vectors like this:
let v = vec![1,2,3];
match v {
[] => ...
[first, second, ..rest] => ...
}
..etc. However this is now behind the slice_patterns feature gate. I tried something similar to this:
match input.chars() {
[w, ..] => ...
}
Which informed me that feature gates require non-stable releases to use.
So I downloaded multirust and installed the latest nightly I could find (2016-01-05) and when I finally got the slice_patterns feature working ... I ran into endless errors regarding syntax and "rest" (in the above example) not being allowed.
So, is there an equivalent way to destructure a string of characters, utilizing ::-like functionality ... in Rust? Basically I want to match 1 character with a guard and use "everything else" in the expression that follows.
It is perfectly acceptable if the answer is "No, there isn't". I certainly cannot find many examples of this sort online anywhere and the slice pattern matching doesn't seem to be high on the feature list.
(I will happily delete this question if there is something I missed in the Rust documentation)

You can use pattern matching with a byte slice:
#![feature(slice_patterns)]
fn internal_check(acc: &[u8]) -> bool {
match acc {
&[b'-', ref tail..] => internal_check(tail),
&[ch, ref tail..] if (ch as char).is_whitespace() => internal_check(tail),
&[] => true,
_ => false,
}
}
fn main() {
for s in ["foo", "bar", " ", " - "].iter() {
println!("text '{}', checks? {}", s, internal_check(s.as_bytes()));
}
}
You can use it with a char slice (where char is a Unicode Scalar Value):
#![feature(slice_patterns)]
fn internal_check(acc: &[char]) -> bool {
match acc {
&['-', ref tail..] => internal_check(tail),
&[ch, ref tail..] if ch.is_whitespace() => internal_check(tail),
&[] => true,
_ => false,
}
}
fn main() {
for s in ["foo", "bar", " ", " - "].iter() {
println!("text '{}', checks? {}",
s, internal_check(&s.chars().collect::<Vec<char>>()));
}
}
But as of now it doesn't work with a &str (producing E0308). I think that is for the best, since &str is neither here nor there: it is a byte slice under the hood, but Rust guarantees that it is valid UTF-8 and nudges you to work with &str in terms of Unicode sequences and characters rather than bytes. So to efficiently match on a &str we have to explicitly call the as_bytes method, essentially telling Rust that "we know what we're doing".
That's my reading, anyway. If you want to dig deeper and into the source code of the Rust compiler you might start with issue 1844 and browse the commits and issues linked there.
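For readers on a newer toolchain: subslice patterns were stabilized in Rust 1.42 with the `name @ ..` syntax, so the byte-slice version no longer needs a feature gate. The following is a minimal sketch of the same check under that assumption; it keeps the `as char` cast from the nightly example, so it is still only meaningful for ASCII input.

```rust
// Assumes Rust 1.42 or newer, where `name @ ..` subslice patterns are stable.
fn internal_check(acc: &[u8]) -> bool {
    match acc {
        // Skip a leading dash and check the rest.
        [b'-', tail @ ..] => internal_check(tail),
        // Skip a leading whitespace byte (same cast as the nightly example).
        [ch, tail @ ..] if (*ch as char).is_whitespace() => internal_check(tail),
        // Nothing left: every byte passed.
        [] => true,
        // Any other leading byte fails the check.
        _ => false,
    }
}
```

Calling it with `" - ".as_bytes()` should accept the input, while `"foo".as_bytes()` should be rejected.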
Basically I want to match 1 character with a guard and use "everything
else" in the expression that follows.
If you only want to match on a single character then using the chars iterator to get the characters and matching on the character itself might be better than converting the entire UTF-8 &str into a &[char] slice. For instance, with the chars iterator you don't have to allocate the memory for the characters array.
fn internal_check(acc: &str) -> bool {
for ch in acc.chars() {
match ch {
'-' => (),
ch if ch.is_whitespace() => (),
_ => return false,
}
}
return true;
}
fn main() {
for s in ["foo", "bar", " ", " - "].iter() {
println!("text '{}', checks? {}", s, internal_check(s));
}
}
You can also use the chars iterator to split the &str on the Unicode Scalar Value boundary:
fn internal_check(acc: &str) -> bool {
let mut chars = acc.chars();
match chars.next() {
Some('-') => internal_check(chars.as_str()),
Some(ch) if ch.is_whitespace() => internal_check(chars.as_str()),
None => true,
_ => false,
}
}
fn main() {
for s in ["foo", "bar", " ", " - "].iter() {
println!("text '{}', checks? {}", s, internal_check(s));
}
}
But keep in mind that as of now Rust provides no guarantees of optimizing this tail-recursive function into a loop. (Tail call optimization would've been a welcome addition to the language but it wasn't implemented so far due to LLVM-related difficulties).
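If the recursion depth is a concern, the head/tail split can be pulled into a small helper and driven by a loop instead. This is only a sketch: `uncons` is a hypothetical name chosen for illustration (it is not a standard library function), built on the same `chars().as_str()` trick.

```rust
/// Splits a string into its first `char` and the remainder.
/// `uncons` is a hypothetical helper name, not part of std.
fn uncons(s: &str) -> Option<(char, &str)> {
    let mut chars = s.chars();
    let head = chars.next()?;
    // `as_str` returns the not-yet-consumed part of the original string.
    Some((head, chars.as_str()))
}

/// Same check as the recursive version, written as a loop.
fn internal_check(mut rest: &str) -> bool {
    while let Some((head, tail)) = uncons(rest) {
        if head != '-' && !head.is_whitespace() {
            return false;
        }
        rest = tail;
    }
    true
}
```

Because the split happens on a `char` boundary, this also works for non-ASCII input, and no allocation is involved.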

I don't believe so. Slice patterns aren't likely to be amenable to this, either, since the "and the rest" part of the pattern goes inside the array pattern, which would imply some way of putting said pattern inside a string, which implies an escaping mechanism that doesn't exist.
In addition, Rust doesn't have a proper "concatenation" operator, and the operators it does have can't participate in destructuring. So, I wouldn't hold your breath on this one.

Just going to post this here... it seems to do what I want. As a simple test, this will just print every character in a string, but print Found a whitespace character when it finds a whitespace character. It does this recursively by destructuring a slice of bytes. I must give a shout out to @ArtemGr, who gave me the inspiration to look at working with bytes to see if that fixed the compiler issues I was having with chars.
There are no doubt memory issues I am unaware of as yet here (copying/allocations, etc; especially around the String instances)... but I'll work on those as I dig deeper in to the inner workings of Rust. It's also probably much more verbose than it needs to be.. this is just where I got to after a little tinkering.
#![feature(slice_patterns)]
use std::iter::FromIterator;
use std::vec::Vec;
fn main() {
process("Hello world!".to_string());
}
fn process(input: String) {
match input.as_bytes() {
&[c, ref _rest..] if (c as char).is_whitespace() => { println!("Found a whitespace character"); process(string_from_rest(_rest)) },
&[c, ref _rest..] => { println!("{}", c as char); process(string_from_rest(_rest)) },
_ => ()
}
}
fn string_from_rest(rest: &[u8]) -> String {
String::from_utf8(Vec::from_iter(rest.iter().cloned())).unwrap()
}
Output:
H
e
l
l
o
Found a whitespace character
w
o
r
l
d
!
Obviously, as it's testing against individual bytes (and only considering possible UTF-8 characters when rebuilding the string), it's not going to work with wide characters. My actual use case only requires characters in the ASCII space, so this is sufficient for now.
I guess, to work on wider characters the Rust pattern matching would require the ability to type coerce (which I don't believe you can do currently?), since a Chars<'T> iterator seems to be inferred as &[_]. That could just be my immaturity with the Rust language though during my other attempts.
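On the copying/allocation worry: one way to avoid rebuilding a String at every step is to recurse on the byte slice itself. The sketch below assumes ASCII input (as in the use case described) and Rust 1.42+ for the stable `rest @ ..` syntax; collecting the lines into an `out` vector instead of printing is an assumption made here only so the result is easy to inspect.

```rust
// Recurses on the borrowed slice, so no intermediate String is rebuilt.
// `out` is a hypothetical output buffer used instead of println!.
fn process(input: &[u8], out: &mut Vec<String>) {
    match input {
        [c, rest @ ..] if c.is_ascii_whitespace() => {
            out.push("Found a whitespace character".to_string());
            process(rest, out);
        }
        [c, rest @ ..] => {
            // Only correct for ASCII bytes, as in the original use case.
            out.push((*c as char).to_string());
            process(rest, out);
        }
        [] => {}
    }
}
```

Each recursive call only borrows a shorter view of the same bytes, so `string_from_rest` is no longer needed.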

Related

How can I (slice) pattern match on an owned Vec with non-Copy elements?

My goal is to move elements out of an owned Vec.
fn f<F>(x: Vec<F>) -> F {
match x.as_slice() {
&[a, b] => a,
_ => panic!(),
}
}
If F is copy, that is no problem as one can simply copy out of the slice. When F is not, slice patterns seem a no-go, as the slice is read only.
Is there such a thing as an "owned slice", or pattern matching on a Vec, to move elements out of x?
Edit: I now see that this code has the more general problem. The function
fn f<T>(x: Vec<T>) -> T {
x[0]
}
leaves "a hole in a Vec", even though it is dropped right after. This is not allowed. This post and this discussion describe that problem.
That leads to the updated question: How can a Vec<T> be properly consumed to do pattern matching?

If you insist on pattern matching, you could do this:
fn f<F>(x: Vec<F>) -> F {
let mut it = x.into_iter();
match (it.next(), it.next(), it.next()) {
(Some(x0), Some(_x1), None) => x0,
_ => panic!(),
}
}
However, if you just want to retrieve the first element of a 2-element vector (panicking in other cases), I guess I'd rather go with this:
fn f<F>(x: Vec<F>) -> F {
assert_eq!(x.len(), 2);
x.into_iter().next().unwrap()
}

You can't use pattern matching with slice patterns in this scenario.
As you have correctly mentioned in your question edits, moving a value out of a Vec leaves it with uninitialized memory. This could then cause Undefined Behaviour when the Vec is subsequently dropped, because its Drop implementation needs to free the heap memory, and possibly drop each element.
There is currently no way to express that your type parameter F does not have a Drop implementation or that it is safe for it to be coerced from uninitialized memory.
You pretty much have to forget the idea of using a slice pattern and write it more explicitly:
fn f<F>(mut x: Vec<F>) -> F {
x.drain(..).next().unwrap()
}
If you are dead set on pattern matching, you can use Itertools::tuples() to match on tuples instead:
use itertools::Itertools; // 0.9.0
fn f<F>(mut x: Vec<F>) -> F {
match x.drain(..).tuples().next() {
Some((a, _)) => a,
None => panic!()
}
}
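A std-only alternative that may be worth considering: since Rust 1.48 a `Vec<T>` can be converted into a fixed-size array with `TryFrom`, and array patterns (unlike slice patterns on a borrowed slice) can move elements out. This is a sketch under that assumption; returning `Option` instead of panicking is a choice made here for illustration, and `first_of_two` is a hypothetical name.

```rust
use std::convert::TryFrom;

/// Returns the first element if the vector has exactly two elements.
fn first_of_two<F>(x: Vec<F>) -> Option<F> {
    match <[F; 2]>::try_from(x) {
        // The array is owned, so `a` is moved out and `_b` is dropped.
        Ok([a, _b]) => Some(a),
        // On a length mismatch the original Vec is handed back in the error.
        Err(_original_vec) => None,
    }
}
```

A caller that wants the panicking behaviour from the question can call `.unwrap()` or `.expect(...)` on the result.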

One way to consume a single element of a vector is to swap it with the last element and then pop the last element:
fn f<F>(mut x: Vec<F>) -> F {
match x.as_slice() {
[_a, _b] => {
x.swap(0, 1);
x.pop().unwrap() // returns a
},
_ => panic!(),
}
}
The code uses an unwrap which isn't elegant.
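The swap-then-pop idea is available in the standard library as `Vec::swap_remove`, which can replace the manual `swap` and `pop` pair. A small sketch follows; `take_at` is a hypothetical helper name, and returning `Option` rather than unwrapping is a choice made for this example.

```rust
/// Moves the element at `index` out of the vector, if it exists.
/// Note that `swap_remove` does not preserve the order of the remaining
/// elements, which does not matter here because the Vec is dropped anyway.
fn take_at<F>(mut x: Vec<F>, index: usize) -> Option<F> {
    if index < x.len() {
        Some(x.swap_remove(index))
    } else {
        None
    }
}
```

The explicit bounds check is there because `swap_remove` itself panics on an out-of-range index.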

How to test if a string contains each character in a pattern in order?

I'm trying to port this Python function that returns true if each character in the pattern appears in the test string in order.
def substr_match(pattern, document):
p_idx, d_idx, p_len, d_len = 0, 0, len(pattern), len(document)
while (p_idx != p_len) and (d_idx != d_len):
if pattern[p_idx].lower() == document[d_idx].lower():
p_idx += 1
d_idx += 1
return p_len != 0 and d_len != 0 and p_idx == p_len
This is what I have at the moment.
fn substr_match(pattern: &str, document: &str) -> bool {
let mut pattern_idx = 0;
let mut document_idx = 0;
let pattern_len = pattern.len();
let document_len = document.len();
while (pattern_idx != pattern_len) && (document_idx != document_len) {
let pat: Vec<_> = pattern.chars().nth(pattern_idx).unwrap().to_lowercase().collect();
let doc: Vec<_> = document.chars().nth(document_idx).unwrap().to_lowercase().collect();
if pat == doc {
pattern_idx += 1;
}
document_idx += 1;
}
return pattern_len != 0 && document_len != 0 && pattern_idx == pattern_len;
}
I tried s.chars().nth(n) since Rust doesn't seem to allow string indexing, but I feel there is a more idiomatic way of doing it. What would be the preferred way of writing this in Rust?

Here is mine:
fn substr_match(pattern: &str, document: &str) -> bool {
let pattern_chars = pattern.chars().flat_map(char::to_lowercase);
let mut doc_chars = document.chars().flat_map(char::to_lowercase);
'outer: for p in pattern_chars {
for d in &mut doc_chars {
if d == p {
continue 'outer;
}
}
return false;
}
true
}

The other answers mimic the behavior of the Python function you started with, but it may be worth trying to make it better. I thought of two test cases where the original function may have surprising behavior:
>>> substr_match("ñ", "in São Paulo")
True
>>> substr_match("🇺🇸", "🇺🇦🇸🇰")
True
Hmm.
(The first example may depend on your input method; try copying and pasting. Also, if you can't see them, the special characters in the second example are flag emoji for the United States, Ukraine, and Slovakia.)
Without getting into why these tests fail or all the other things that could potentially be undesired, if you want to correctly handle Unicode text, you need to, at minimum, operate on graphemes instead of code points (this question describes the difference). Rust doesn't provide this feature in the standard library, so you need the unicode-segmentation crate, which provides a graphemes method on str.
extern crate unicode_segmentation;
use unicode_segmentation::UnicodeSegmentation;
fn substr_match(pattern: &str, document: &str) -> bool {
let mut haystack = document.graphemes(true);
pattern.len() > 0 && pattern.graphemes(true).all(|needle| {
haystack
.find(|grapheme| {
grapheme
.chars()
.flat_map(char::to_lowercase)
.eq(needle.chars().flat_map(char::to_lowercase))
})
.is_some()
})
}
Playground, test cases provided.
This algorithm takes advantage of several convenience methods on Iterator. all iterates over the pattern. find short-circuits, so whenever it finds the next needle in haystack, the next call to haystack.find will start at the following element.
(I thought this approach was somewhat clever, but honestly, a nested for loop is probably easier to read, so you might prefer that.)
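To see the short-circuiting trick in isolation, here is a std-only sketch that applies the same `all` plus shared-iterator idea to code points instead of graphemes. It is deliberately not grapheme-aware, so it has the Unicode problems described in this answer; it only illustrates how the shared `haystack` iterator keeps its position between needles.

```rust
fn substr_match(pattern: &str, document: &str) -> bool {
    // One shared iterator over the document: each search resumes
    // where the previous one stopped.
    let mut haystack = document.chars().flat_map(char::to_lowercase);
    !pattern.is_empty()
        && pattern
            .chars()
            .flat_map(char::to_lowercase)
            // `any` consumes `haystack` up to and including the first match.
            .all(|needle| haystack.any(|c| c == needle))
}
```

Here `any` plays the role of `find(..).is_some()`; both stop consuming the iterator at the first match.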
The last "tricky" bit is case-insensitive string comparison, which is inherently language-dependent, but if you're willing to accept only unconditional mappings (those that apply in any language), char::to_lowercase does the trick. Rather than collect the result into a String, though, you can use Iterator::eq to compare the sequences of (lowercased) characters.
One other thing you may want to consider is Unicode normalization -- this question is a good place for the broad strokes. Fortunately, Rust has a unicode-normalization crate, too! And it looks quite easy to use. (You wouldn't necessarily want to use it in this function, though; instead, you might normalize all text on input so that you're dealing with the same normalization form everywhere in your program.)
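As a small illustration of the code point versus grapheme distinction, the standard library alone is enough to show that text which looks like one character can consist of several `char`s. The helper name `code_points` is made up for this sketch.

```rust
/// Collects the Unicode scalar values of a string (hypothetical helper).
fn code_points(s: &str) -> Vec<char> {
    s.chars().collect()
}

/// Precomposed "ñ": a single code point, U+00F1.
const PRECOMPOSED: &str = "\u{f1}";
/// Decomposed "ñ": the letter n followed by a combining tilde, U+0303.
const DECOMPOSED: &str = "n\u{303}";
/// A flag emoji is a pair of regional indicator symbols.
const FLAG_US: &str = "\u{1F1FA}\u{1F1F8}";
```

Both spellings of "ñ" usually render identically, yet one is a single `char` and the other is two, which is why comparing `char` by `char` can give surprising results and why normalizing on input helps.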

str::chars() returns an iterator. Iterators return elements from a sequence one at a time. Specifically, str::chars() returns characters from a string one at a time. It's much more efficient to use a single iterator to iterate over a string than to create a new iterator each time you want to look up a character, because s.chars().nth(n) needs to perform a linear scan in order to find the nth character in the UTF-8 encoded string.
fn substr_match(pattern: &str, document: &str) -> bool {
let mut pattern_iter = pattern.chars();
let mut pattern_ch_lower: String = match pattern_iter.next() {
Some(ch) => ch,
None => return false,
}.to_lowercase().collect();
for document_ch in document.chars() {
let document_ch_lower: String = document_ch.to_lowercase().collect();
if pattern_ch_lower == document_ch_lower {
pattern_ch_lower = match pattern_iter.next() {
Some(ch) => ch,
None => return true,
}.to_lowercase().collect();
}
}
return false;
}
Here, I'm demonstrating two ways of using iterators:
To iterate over the pattern, I'm using the next method manually. next returns an Option: Some(value) if the iterator hasn't finished, or None if it has.
To iterate over the document, I'm using a for loop. The for loop does the work of calling next and unwrapping the result until next returns None.
One thing to notice is that I'm using a return expression inside a match expression (twice). Since a return expression doesn't produce a value, the compiler knows that its type doesn't matter. In this case, on the Some arm, the result is a char, so the whole match evaluates to a char.
We could also do this with two nested for loops:
fn substr_match(pattern: &str, document: &str) -> bool {
if pattern.len() == 0 {
return false;
}
let mut document_iter = document.chars();
'outer: for pattern_ch in pattern.chars() {
let pattern_ch_lower: String = pattern_ch.to_lowercase().collect();
for document_ch in &mut document_iter {
let document_ch_lower: String = document_ch.to_lowercase().collect();
if pattern_ch_lower == document_ch_lower {
// Found this pattern character; move on to the next one.
continue 'outer;
}
}
// The document ran out before this pattern character was found.
return false;
}
return true;
}
There are two things to notice here:
We need to handle the case where the pattern is empty without using the iterator.
In the inner loop, we don't want to restart from the start of the document when we move to the next pattern character, so we need to reuse the same iterator over the document. When we write for x in iter, the for loop takes ownership of iter; to avoid that, we must write &mut iter instead. Mutable references to iterators are iterators themselves, thanks to the blanket implementation impl<'a, I> Iterator for &'a mut I where I: Iterator + ?Sized in the standard library.
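The `&mut iter` point can be shown on its own with a tiny sketch. The function name and the "take two" rule below are invented for the example; the only thing it demonstrates is that the iterator is still usable, and has kept its position, after the `for` loop ends.

```rust
/// Takes up to two items with a `for` loop, then collects whatever is left.
fn take_two_then_rest(values: &[i32]) -> (Vec<i32>, Vec<i32>) {
    let mut iter = values.iter().copied();
    let mut head = Vec::new();
    // Borrowing with `&mut` means the loop does not take ownership of `iter`.
    for v in &mut iter {
        head.push(v);
        if head.len() == 2 {
            break;
        }
    }
    // `iter` continues right after the last item the loop consumed.
    let rest: Vec<i32> = iter.collect();
    (head, rest)
}
```

Writing `for v in iter` instead would move the iterator into the loop, and the later `iter.collect()` would not compile.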

Why is capitalizing the first letter of a string so convoluted in Rust?

I'd like to capitalize the first letter of a &str. It's a simple problem and I hope for a simple solution. Intuition tells me to do something like this:
let mut s = "foobar";
s[0] = s[0].to_uppercase();
But &strs can't be indexed like this. The only way I've been able to do it seems overly convoluted. I convert the &str to an iterator, convert the iterator to a vector, upper case the first item in the vector, which creates an iterator, which I index into, creating an Option, which I unwrap to give me the upper-cased first letter. Then I convert the vector into an iterator, which I convert into a String, which I convert to a &str.
let s1 = "foobar";
let mut v: Vec<char> = s1.chars().collect();
v[0] = v[0].to_uppercase().nth(0).unwrap();
let s2: String = v.into_iter().collect();
let s3 = &s2;
Is there an easier way than this, and if so, what? If not, why is Rust designed this way?
Similar question

Why is it so convoluted?
Let's break it down, line-by-line
let s1 = "foobar";
We've created a literal string that is encoded in UTF-8. UTF-8 allows us to encode the 1,114,112 code points of Unicode in a manner that's pretty compact if you come from a region of the world that types in mostly characters found in ASCII, a standard created in 1963. UTF-8 is a variable length encoding, which means that a single code point might take from 1 to 4 bytes. The shorter encodings are reserved for ASCII, but many Kanji take 3 bytes in UTF-8.
let mut v: Vec<char> = s1.chars().collect();
This creates a vector of characters. A character is a 32-bit number that directly maps to a code point. If we started with ASCII-only text, we've quadrupled our memory requirements. If we had a bunch of characters from the astral plane, then maybe we haven't used that much more.
v[0] = v[0].to_uppercase().nth(0).unwrap();
This grabs the first code point and requests that it be converted to an uppercase variant. Unfortunately for those of us who grew up speaking English, there's not always a simple one-to-one mapping of a "small letter" to a "big letter". Side note: we call them upper- and lower-case because one box of letters was above the other box of letters back in the day.
This code would panic if a code point produced no uppercase characters at all. I'm not sure if those exist, actually (the current documentation for char::to_uppercase says that a char without an uppercase mapping is yielded unchanged). It could also semantically fail when a code point has an uppercase variant that has multiple characters, such as the German ß. Note that ß may never actually be capitalized in The Real World; this is just the example I can always remember and search for. As of 2017-06-29, in fact, the official rules of German spelling have been updated so that both "ẞ" and "SS" are valid capitalizations!
let s2: String = v.into_iter().collect();
Here we convert the characters back into UTF-8 and require a new allocation to store them in, as the original variable was stored in constant memory so as to not take up memory at run time.
let s3 = &s2;
And now we take a reference to that String.
It's a simple problem
Unfortunately, this is not true. Perhaps we should endeavor to convert the world to Esperanto?
I presume char::to_uppercase already properly handles Unicode.
Yes, I certainly hope so. Unfortunately, Unicode isn't enough in all cases.
Thanks to huon for pointing out the Turkish I, where both the upper (İ) and lower case (i) versions have a dot. That is, there is no one proper capitalization of the letter i; it depends on the locale of the source text as well.
why the need for all data type conversions?
Because the data types you are working with are important when you are worried about correctness and performance. A char is 32 bits and a string is UTF-8 encoded. They are different things.
indexing could return a multi-byte, Unicode character
There may be some mismatched terminology here. A char is a multi-byte Unicode character.
Slicing a string is possible if you go byte-by-byte, but the standard library will panic if you are not on a character boundary.
One of the reasons that indexing a string to get a character was never implemented is because so many people misuse strings as arrays of ASCII characters. Indexing a string to set a character could never be efficient - you'd have to be able to replace 1-4 bytes with a value that is also 1-4 bytes, causing the rest of the string to bounce around quite a lot.
to_uppercase could return an upper case character
As mentioned above, ß is a single character that, when capitalized, becomes two characters.
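Both points can be checked with a few lines of std-only code. This is just a sketch; the helper names `first_char_str` and `uppercase_char` are made up for the example.

```rust
/// Returns the first character as a string slice, cut on a char boundary.
fn first_char_str(s: &str) -> Option<&str> {
    let first = s.chars().next()?;
    // `len_utf8` is the number of bytes this char occupies in UTF-8.
    s.get(..first.len_utf8())
}

/// Uppercases a single char; the result may be more than one char long.
fn uppercase_char(c: char) -> String {
    c.to_uppercase().collect()
}
```

Unlike indexing with `s[0..1]`, `str::get` returns `None` instead of panicking when the range does not fall on a character boundary, so `"ß".get(0..1)` is `None` because ß takes two bytes.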
Solutions
See also trentcl's answer which only uppercases ASCII characters.
Original
If I had to write the code, it'd look like:
fn some_kind_of_uppercase_first_letter(s: &str) -> String {
let mut c = s.chars();
match c.next() {
None => String::new(),
Some(f) => f.to_uppercase().chain(c).collect(),
}
}
fn main() {
println!("{}", some_kind_of_uppercase_first_letter("joe"));
println!("{}", some_kind_of_uppercase_first_letter("jill"));
println!("{}", some_kind_of_uppercase_first_letter("von Hagen"));
println!("{}", some_kind_of_uppercase_first_letter("ß"));
}
But I'd probably search for uppercase or unicode on crates.io and let someone smarter than me handle it.
Improved
Speaking of "someone smarter than me", Veedrac points out that it's probably more efficient to convert the iterator back into a slice after the first capital codepoints are accessed. This allows for a memcpy of the rest of the bytes.
fn some_kind_of_uppercase_first_letter(s: &str) -> String {
let mut c = s.chars();
match c.next() {
None => String::new(),
Some(f) => f.to_uppercase().collect::<String>() + c.as_str(),
}
}

Is there an easier way than this, and if so, what? If not, why is Rust designed this way?
Well, yes and no. Your code is, as the other answer pointed out, not correct, and will panic if you give it something like བོད་སྐད་ལ་. So doing this with Rust's standard library is even harder than you initially thought.
However, Rust is designed to encourage code reuse and make bringing in libraries easy. So the idiomatic way to capitalize a string is actually quite palatable:
extern crate inflector;
use inflector::Inflector;
let capitalized = "some string".to_title_case();

It's not especially convoluted if you are able to limit your input to ASCII-only strings.
Since Rust 1.23, str has a make_ascii_uppercase method (in older Rust versions, it was available through the AsciiExt trait). This means you can uppercase ASCII-only string slices with relative ease:
fn make_ascii_titlecase(s: &mut str) {
if let Some(r) = s.get_mut(0..1) {
r.make_ascii_uppercase();
}
}
This will turn "taylor" into "Taylor", but it won't turn "édouard" into "Édouard". (playground)
Use with caution.

I did it this way:
fn str_cap(s: &str) -> String {
format!("{}{}", (&s[..1].to_string()).to_uppercase(), &s[1..])
}
If it is not an ASCII string:
fn str_cap(s: &str) -> String {
format!("{}{}", s.chars().next().unwrap().to_uppercase(),
s.chars().skip(1).collect::<String>())
}

The OP's approach taken further:
replace the first character with its uppercase representation
let mut s = "foobar".to_string();
let r = s.remove(0).to_uppercase().to_string() + &s;
or
let r = format!("{}{s}", s.remove(0).to_uppercase());
println!("{r}");
This works with non-ASCII characters as well, e.g. "😎foobar".
If the first character is guaranteed to be ASCII, it can be changed to a capital letter in place:
let mut s = "foobar".to_string();
if !s.is_empty() {
s[0..1].make_ascii_uppercase(); // Foobar
}
Panics with a non ASCII character in first position!

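Since the to_uppercase() method returns a new String, you should be able to just append the remainder of the string, like so.
This was tested with Rust 1.57+ but is likely to work in any version that supports string slicing. Note that slicing with s[0..1] panics if the string is empty or if its first character is not a single byte (that is, not ASCII).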
fn uppercase_first_letter(s: &str) -> String {
s[0..1].to_uppercase() + &s[1..]
}

Here's a version that is a bit slower than @Shepmaster's improved version, but also more idiomatic:
fn capitalize_first(s: &str) -> String {
let mut chars = s.chars();
chars
.next()
.map(|first_letter| first_letter.to_uppercase())
.into_iter()
.flatten()
.chain(chars)
.collect()
}

This is how I solved this problem. Notice that I had to check whether self is ASCII before transforming it to uppercase.
trait TitleCase {
fn title(&self) -> String;
}
impl TitleCase for &str {
fn title(&self) -> String {
if !self.is_ascii() || self.is_empty() {
return String::from(*self);
}
let (head, tail) = self.split_at(1);
head.to_uppercase() + tail
}
}
pub fn main() {
println!("{}", "bruno".title());
println!("{}", "b".title());
println!("{}", "🦀".title());
println!("{}", "ß".title());
println!("{}", "".title());
println!("{}", "བོད་སྐད་ལ".title());
}
Output
Bruno
B
🦀
ß
བོད་སྐད་ལ

Inspired by the get_mut examples, I wrote something like this:
fn make_capital(in_str : &str) -> String {
let mut v = String::from(in_str);
v.get_mut(0..1).map(|s| { s.make_ascii_uppercase(); &*s });
v
}

What is the "standard" way to concatenate strings?

While I understand basically what str and std::string::String are and how they relate to each other, I find it a bit cumbersome to compose strings out of various parts without spending too much time and thought on it. So as usual I suspect I did not see the proper way to do it yet, which makes it intuitive and a breeze.
let mut s = std::string::String::with_capacity(200);
let precTimeToJSON = | pt : prectime::PrecTime, isLast : bool | {
s.push_str(
"{ \"sec\": "
+ &(pt.sec.to_string())
+ " \"usec\": "
+ &(pt.usec.to_string())
+ if isLast {"}"} else {"},"})
};
The code above is honored by the compiler with error messages like:
src\main.rs:25:20: 25:33 error: binary operation + cannot be applied to type &'static str [E0369]
And even after half an hour's worth of fiddling and randomly adding &, I could not make this compile. So, here are my questions:
What do I have to write to achieve the obvious?
What is the "standard" way to do this in Rust?

The Rust compiler is right (of course): there's no + operator for string literals.
I believe the format!() macro is the idiomatic way to do what you're trying to do. It uses the std::fmt syntax, which essentially consists of a formatting string and the arguments to format (a la C's printf). For your example, it would look something like this:
let mut s: String = String::new();
let precTimeToJSON = | pt : prectime::PrecTime, isLast : bool | {
s = format!("{{ \"sec\": {} \"usec\": {} }}{}",
pt.sec,
pt.usec,
if isLast { "" } else { "," }
)
};
Because it's a macro, you can intermix types in the argument list freely, so long as the type implements the std::fmt::Display trait (which is true for all built-in types). Also, you must escape literal { and } as {{ and }}, respectively. Last, note that the format string must be a string literal, because the macro parses it and the expanded code looks nothing like the original format! expression.
Here's a playground link to the above example.
Two more points for you. First, if you're reading and writing JSON, have a look at a library such as serde. It's much less painful!
Second, if you just want to concatenate &'static str strings (that is, string literals), you can do that with zero run-time cost with the concat!() macro. It won't help you in your case above, but it might with other similar ones.
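If the goal is to keep appending to an existing buffer, as the question's `push_str` call suggests, the `write!` macro can format directly into a `String` through the `std::fmt::Write` trait. The sketch below makes a few assumptions: `PrecTime` is a stand-in struct with `u64` fields (the real `prectime::PrecTime` is not shown in the question), and a comma is added between the two fields so that the output is valid JSON.

```rust
use std::fmt::Write;

// Hypothetical stand-in for the question's `prectime::PrecTime`.
struct PrecTime {
    sec: u64,
    usec: u64,
}

/// Appends one JSON object to `buf`, followed by a comma unless it is the last.
fn append_prec_time(buf: &mut String, pt: &PrecTime, is_last: bool) {
    let separator = if is_last { "" } else { "," };
    // Writing into a String cannot fail, so unwrapping the Result is safe.
    write!(buf, "{{ \"sec\": {}, \"usec\": {} }}{}", pt.sec, pt.usec, separator).unwrap();
}

// `concat!` joins literals at compile time, with no run-time cost.
const GREETING: &str = concat!("Hello", ", ", "world");
```

This avoids the temporary `String` that `format!` followed by `push_str` would create.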

Itertools::format can help you write this as a single expression if you really want to.
let times: Vec<PrecTime>; // iterable of PrecTime
let s = format!("{}", times.iter().format(",", |pt, f|
f(&format_args!(r#"{{ "sec": {}, "usec": {} }}"#, pt.sec, pt.usec))
));
format() uses a separator, so just specify "," there (or "" if you need no separator). It's a bit involved so that the formatting can be completely lazy and composable. You receive a callback f that you call back with a &Display value (anything that can be Display formatted).
Here we demonstrate this great trick of using &format_args!() to construct a displayable value. This is something that comes in handy if you use the debug builder API as well.
Finally, use a raw string so that we don't need to escape the inner " in the format: r#"{{ "sec": {}, "usec": {} }}"#. Raw strings are delimited by r#" and "# (free choice of number of #).
Itertools::format() uses no intermediate allocations, it is all directly passed on to the underlying formatter object.
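For comparison, a std-only version of the same idea is sketched below. Unlike `Itertools::format`, it does allocate an intermediate `String` per element plus a `Vec`, so it trades efficiency for having no dependency. `PrecTime` is again a hypothetical stand-in struct.

```rust
// Hypothetical stand-in for the question's `prectime::PrecTime`.
struct PrecTime {
    sec: u64,
    usec: u64,
}

/// Formats every element and joins the pieces with commas.
fn times_to_json(times: &[PrecTime]) -> String {
    times
        .iter()
        .map(|pt| format!(r#"{{ "sec": {}, "usec": {} }}"#, pt.sec, pt.usec))
        .collect::<Vec<String>>()
        .join(",")
}
```

Because `join` only places the separator between elements, no "is this the last one" flag is needed.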

You can also do this madness:
fn main() {
let mut s = std::string::String::with_capacity(200);
// Have to put this in a block so precTimeToJSON is dropped, see https://doc.rust-lang.org/book/closures.html
{
// I have no idea why this has to be mut...
let mut precTimeToJSON = |sec: u64, usec: u64, isLast: bool| {
s.push_str(&( // Coerce String to str. See https://doc.rust-lang.org/book/deref-coercions.html
"{ \"sec\": ".to_string() // String
+ &sec.to_string() // + &str (& coerces a String to a &str).
+ " \"usec\": " // + &str
+ &usec.to_string() // + &str
+ if isLast {"}"} else {"},"} // + &str
));
};
precTimeToJSON(30, 20, false);
}
println!("{}", &s);
}
Basically the operator String + &str -> String is defined, so you can do String + &str + &str + &str + &str. That gives you a String which you have to coerce back to a &str using &. I think this way is probably quite inefficient though as it will (possibly) allocate loads of Strings.
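If the repeated `+` is the concern, one alternative is to put the pieces into an array of `&str` and call `concat()` on it, which builds the result in one go. A sketch, reusing the literals from the code above verbatim (including the missing comma between the two fields); `prec_time_fragment` is a hypothetical name.

```rust
/// Builds the same text as the `+` chain, using `concat` on a slice of &str.
fn prec_time_fragment(sec: u64, usec: u64, is_last: bool) -> String {
    let sec = sec.to_string();
    let usec = usec.to_string();
    let end = if is_last { "}" } else { "}," };
    // All parts are &str, so `concat` can join them into one new String.
    ["{ \"sec\": ", sec.as_str(), " \"usec\": ", usec.as_str(), end].concat()
}
```

The two `to_string` calls still allocate, so `format!` or `write!` remains the more common choice for this kind of output.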

Match String Tuple in Rust

This is one of those simple-but-I-don't-know-how-to-do-it-in-rust things.
Simply put:
pub fn pair_matcher(tup: &(String, String)) {
match tup {
&("foo".as_string(), "bar".as_string()) => print!("foobar"),
_ => print!("Unknown"),
}
}
I get the error
-:3:17: 3:18 error: expected `,`, found `.`
-:3 &("foo".as_string(),"bar".as_string())=> { print!("foobar"); }
^
How do you match this?

The left hand side of each branch of a match is not an expression, it is a pattern, which restricts what can go there to basically just literals (plus things like ref which change the binding behaviour); function calls are right out. Given how String works, it’s not possible to get one of them into a pattern (because you can’t construct one statically). It could be achieved with if statements:
if tup == ("foo".to_string(), "bar".to_string()) {
print!("foobar")
} else {
print!("Unknown")
}
… or by taking a slice of the Strings, yielding the type &str which can be constructed literally:
match (tup.0.as_slice(), tup.1.as_slice()) {
("foo", "bar") => print!("foobar"),
_ => print!("Unknown"),
}
Constructing a new String each time is an expensive way of doing things, while using the slices is pretty much free, entailing no allocations.
Note that the .0 and .1 requires #![feature(tuple_indexing)] on the crate; one can do without it thus:
let (ref a, ref b) = tup;
match (a.as_slice(), b.as_slice()) {
("foo", "bar") => print!("foobar"),
_ => print!("Unknown"),
}
Because, you see, the left hand side of a let statement is a pattern as well, and so you can pull the tuple apart with it, taking references to each element, with (ref a, ref b) yielding variables a and b both of type &String.
The Patterns section of the guide goes into some more detail on the subject.
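A note for current Rust: `as_slice` on `String` dates from before 1.0, and tuple indexing no longer needs a feature gate. The equivalent today uses `String::as_str`. The sketch below returns the text instead of printing it, which is a change made here only so the result is easy to check.

```rust
/// Matches a pair of Strings against string literals without allocating.
fn pair_matcher(tup: &(String, String)) -> &'static str {
    // `as_str` borrows each String as a &str, which literals can match.
    match (tup.0.as_str(), tup.1.as_str()) {
        ("foo", "bar") => "foobar",
        _ => "Unknown",
    }
}
```

Calling it with `&("foo".to_string(), "bar".to_string())` should take the first arm; any other pair falls through to the wildcard.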

The solution here is that you need to convert types in the other direction, from String to &str:
match (tup.0.as_slice(), tup.1.as_slice()) {
("foo", "bar") => print!("foobar"),
_ => print!("Unknown"),
}
