How to extract a value from a set of Strings? - rust

I have a set of strings that I get using the lines() function. The strings look like this:
abcdjf hfdf
test oinf=ddfn
cbdfk test12345=my value
mngf jdk
I want to extract my value from the strings above, so I am using this code:
body.lines()
    .filter(|s| s.contains("test12345="))
    .map(|x| x.split("=")[1]).to_string();
But it doesn't work and doesn't return any value. What is the correct code for this?

First of all, you cannot call to_string on an iterator. Second, split returns an iterator as well, so you cannot index it (i.e. [1]); instead, you need to call nth(1).
body
    .lines()
    .filter(|s| s.contains("test12345="))
    .map(|x| x.split("=").nth(1))
If the value itself can contain further = characters that you want to keep, use splitn(2, "=") instead, i.e.:
.map(|x| x.splitn(2, "=").nth(1))
Also, because of your filter, nth(1) always finds a value, so every item is needlessly wrapped in Some(..). To avoid that, you can combine the filter and map using filter_map.
body
    .lines()
    .filter_map(|s| {
        if s.contains("test12345=") {
            s.splitn(2, "=").nth(1)
        } else {
            None
        }
    });
Since you attempted to use to_string: if you want the iterator to yield String instead of &str, you can add .map(ToString::to_string) either after nth(1) or after filter_map(..).
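Putting the pieces of this answer together, a minimal self-contained sketch could look like the following. The function name extract_values and its key parameter are made up for illustration; they do not appear in the question.

```rust
/// Returns the text after the first '=' of every line that contains `key`.
/// `extract_values` is a hypothetical helper name used only in this sketch.
fn extract_values(body: &str, key: &str) -> Vec<String> {
    body.lines()
        .filter_map(|line| {
            if line.contains(key) {
                // Split on the first '=' only, so later '=' stay in the value.
                line.splitn(2, "=").nth(1)
            } else {
                None
            }
        })
        // Convert each borrowed &str into an owned String.
        .map(ToString::to_string)
        .collect()
}
```

Called as extract_values(body, "test12345=") on the sample input from the question, this should collect a single element, "my value".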

Iterator::map() returns an iterator, not a value, so you can't call to_string() on it. Likewise, str::split() does not return a slice but an iterator, so you can't index it with [1]; instead, you must access the element through the iterator API. As far as Rust knows, there could be multiple lines that contain "test12345=", so it must deal with that. To do so, you need to .collect() your results into a Vec<String>:
let values: Vec<String> = body.lines()
    .filter(|s| s.contains("test12345="))
    .map(|x| x.split("=").nth(1).unwrap().to_string())
    .collect();
Now, that doesn't look nice or idiomatic, does it? Since .filter().map() is a common pattern, there's .filter_map(), which does both in a single call. Handily, it expects the closure to return Option<T>, so you can use ? for early returns if needed.
let values: Vec<String> = body.lines()
    .filter_map(|line| {
        if !line.contains("test12345=") {
            return None;
        }
        line.split("=").nth(1).map(String::from)
    })
    .collect();
Iterator::nth() gives you the nth element of the iterator, but that element might not exist, which is why it returns an Option. With Option::map() you can convert the &str to a String if there is a value. Passing the String::from function as the argument to .map() converts the Option<&str> into an Option<String>, which matches the return type of the closure, so now you have what you're looking for.
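If only the first matching line is needed, a variant that is not taken from the answers above is to combine Iterator::find_map with str::split_once (stable since Rust 1.52) and to split on the whole key rather than on "=". The name first_value is hypothetical.

```rust
/// Returns the text following the first occurrence of `key` on the first
/// line that contains it, or `None` if no line contains `key`.
/// `first_value` is a made-up name for this sketch.
fn first_value<'a>(body: &'a str, key: &str) -> Option<&'a str> {
    body.lines()
        // `split_once` returns `Some((before, after))` only when `key` is present,
        // so `find_map` stops at the first line that matches.
        .find_map(|line| line.split_once(key).map(|(_, value)| value))
}
```

Because the split happens on the key itself, an "=" that appears earlier on the same line does not affect the result, and the caller decides whether to call .to_string() on the returned &str.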

Related

How to succinctly convert an iterator over &str to a collection of String

I am new to Rust, and it seems very awkward to use sequences of functional transformations on strings, because they often return &str.
For example, here is an implementation in which I try to read lines of two words separated by a space, and store them into a container of tuples:
use itertools::Itertools;
fn main() {
    let s = std::io::stdin()
        .lines()
        .map(|l| l.unwrap())
        .map(|l| {
            l.split(" ")
                .collect_tuple()
                .map(|(a, b)| (a.to_string(), b.to_string()))
                .unwrap()
        })
        .collect::<Vec<_>>();
    println!("{:?}", s);
}
https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=7f6d370457cc3254195565f69047018c
Because split returns an iterator to &str objects, whose scope is the lambda used for the map, the only way I saw to return them was to manually convert them back to strings. This seems really awkward.
Is there a better way to implement such a program?
Rust is explicit about allocation. The Strings returned by the lines() iterator don't persist beyond the iterator chain, so you can't just store references into them. Therefore, logically, there needs to be a to_string() (or to_owned, or String::from) somewhere.
But putting it after the tuple creation is a bit awkward, because it requires you to call the function twice. You can turn the result of the split() into owned objects instead. This should work:
.map(|l| {
    l.split(" ")
        .map(String::from)
        .collect_tuple()
        .unwrap()
})
.collect::<Vec<(_,_)>>();
Note that now you have to be explicit about the tuple type, though.
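For comparison, here is a sketch of the same idea using only the standard library. It swaps itertools' collect_tuple for str::split_once and reads from a &str instead of stdin so that it is easy to test. The name parse_pairs is hypothetical, and the behaviour on malformed lines is a choice made for this sketch: lines without a space are skipped, and anything after the first space ends up in the second element, whereas the original program panics via unwrap.

```rust
/// Parses lines of the form "first second" into owned pairs.
/// Lines that contain no space are skipped (an assumption of this sketch).
fn parse_pairs(input: &str) -> Vec<(String, String)> {
    input
        .lines()
        // Split each line at the first space into two borrowed halves.
        .filter_map(|line| line.split_once(' '))
        // The halves borrow from `input`; convert them to owned Strings.
        .map(|(a, b)| (a.to_owned(), b.to_owned()))
        .collect()
}
```

The explicit conversion to owned values is still there, as the answer says it must be, but it happens in one place.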

How to count HashMap values using a predicate in Rust?

I'm trying this, but it doesn't work:
let map = HashMap::new();
map.insert(1, "aaa");
map.insert(2, "bbb");
let a = map.counts_by(|k, v| v.starts_with("a"));
What is the right way?
Anything that iterates over collections in Rust is going to factor through the Iterator API, and unlike in Java where iterators are often implicitly used, it's very common in Rust to explicitly ask for an iterator (with .iter()) and do some work directly on it in a functional style. In your case, there are three things we need to do here.
1. Get the values of the HashMap. This can be done with the values method, which returns an iterator.
2. Keep only the ones satisfying a particular predicate. This is a filter operation and will produce another iterator. Note that this does not yet iterate over the hash map; it merely produces another iterator capable of doing so later.
3. Count the matches, using count.
Putting it all together, we have
map.values().filter(|v| v.starts_with("a")).count()
You should filter an iterator of the HashMap, then count the elements of the iterator:
use std::collections::HashMap;
fn main() {
    let mut map = HashMap::new();
    map.insert(1, "aaa");
    map.insert(2, "bbb");
    assert_eq!(
        map.iter().filter(|(_k, v)| v.starts_with("a")).count(),
        1
    );
}
Notice that the map also has to be marked as mut in order to insert new elements, and the filter closure destructures into a tuple containing the key and the value, rather than accepting two separate parameters.
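If the two-argument predicate from the question is what you are after, the filter-then-count approach can be wrapped in a small helper. This is only a sketch: count_by is a hypothetical free function, not a method that HashMap provides.

```rust
use std::collections::HashMap;

/// Counts the entries of `map` for which `pred(key, value)` returns true.
/// `count_by` is a made-up helper; the standard library has no such method.
fn count_by<K, V, F>(map: &HashMap<K, V>, mut pred: F) -> usize
where
    F: FnMut(&K, &V) -> bool,
{
    // `iter()` yields `(&K, &V)` tuples; destructure them for the predicate.
    map.iter().filter(|&(k, v)| pred(k, v)).count()
}
```

With the map from the question, count_by(&map, |_k, v| v.starts_with("a")) should evaluate to 1.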

Split string in Rust, treating consecutive delimiters as one

How do I split a string in Rust such that contiguous delimiters are collapsed into one? For example:
"1  2 3".splitX(" ")
should yield this Vec: ["1", "2", "3"] (when collected from the Split object, or any other intermediate object there may be). This example is for whitespace but we should be able to extend this for other delimiters too.
I believe we can use .filter() to remove empty items after using .split(), but it would be cleaner if it could be done as part of the original .split() directly. I obviously searched this thoroughly and am surprised I can't find the answer anywhere.
I know for whitespace we already have split_whitespace() and split_ascii_whitespace(), but I am looking for a solution that works for a general delimiter string.
The standard solution is to use split then filter:
let output: Vec<&str> = input
    .split(pattern)
    .filter(|s| !s.is_empty())
    .collect();
This is fast and clear.
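As a self-contained sketch of the split-then-filter approach for a string delimiter (split_collapsed is a hypothetical name):

```rust
/// Splits `input` on `delimiter`, dropping the empty pieces that appear
/// between consecutive delimiters and at the start or end of the input.
fn split_collapsed<'a>(input: &'a str, delimiter: &str) -> Vec<&'a str> {
    input
        .split(delimiter)
        // A plain `split` yields "" between two adjacent delimiters.
        .filter(|piece| !piece.is_empty())
        .collect()
}
```

Without the filter step, splitting "1  2 3" (two spaces after the 1) on " " would produce four pieces, one of them empty; with it, only "1", "2" and "3" remain.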
You can also use a regular expression to avoid the filter step:
let output: Vec<&str> = regex::Regex::new(" +").unwrap()
    .split(input)
    .collect();
If it's in a function which will be called several times, you can avoid repeating the Regex compilation with lazy_regex:
let output: Vec<&str> = lazy_regex::regex!(" +")
    .split(input)
    .collect();
IMO, by far the cleanest way is to write .split(" ").filter(|s| !s.is_empty()). It works for all separators and the intent is obvious from reading the code.
If that's too "ugly", you could perhaps pull it into a trait:
trait SplitNonEmpty {
    // you might want to define your own struct for the return type
    fn split_non_empty<'a, P>(&self, p: P) -> ... where P: Pattern<'a>;
}
impl SplitNonEmpty for &str {
    // ...
}
If it's very important that this function returns a Split, you might need to refactor your code to use traits more; do you really care that it was created by splitting a string, or do you care that you can iterate over it? If the latter, maybe that function should take an impl IntoIterator<Item = &'a str>?
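One way the trait idea could be fleshed out on stable Rust is sketched below. It makes two simplifying assumptions that the outline above does not: the delimiter is restricted to &str, because the std::str::pattern::Pattern trait is not usable on stable Rust, and the method returns a boxed iterator instead of a dedicated struct.

```rust
/// Extension trait adding a "split and drop empty pieces" method to strings.
/// The signature is an assumption of this sketch, not an established API.
trait SplitNonEmpty {
    fn split_non_empty<'a>(&'a self, delim: &'a str) -> Box<dyn Iterator<Item = &'a str> + 'a>;
}

impl SplitNonEmpty for str {
    fn split_non_empty<'a>(&'a self, delim: &'a str) -> Box<dyn Iterator<Item = &'a str> + 'a> {
        // Same split-then-filter logic, hidden behind a method call.
        Box::new(self.split(delim).filter(|piece| !piece.is_empty()))
    }
}
```

Implementing the trait for str (rather than &str) lets both string literals and String values call it through auto-deref, for example "1  2 3".split_non_empty(" ").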
As others have said, split followed by filter, or a regex, is the better choice here. There is one more pattern that can be used, though: flat_map. In this context it doesn't add much value.
fn main() {
    let output: Vec<&str> = "1  2 3"
        .split(" ")
        .flat_map(|x| if !x.is_empty() { Some(x) } else { None })
        .collect();
    println!("{:#?}", output)
}
You can use this pattern, say, if you want to parse these strings as numbers and ignore error values.
fn main() {
    let output: Vec<i32> = "1  2 3"
        .split(" ")
        .flat_map(|x| x.parse())
        .collect();
    println!("{:#?}", output)
}
All flat_map cares about is that the closure returns something that implements IntoIterator, which both Option and Result do.
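To see the "ignore error values" behaviour with an input that actually contains an invalid token, here is a small sketch (parse_valid is a hypothetical name):

```rust
/// Parses every space-separated token that is a valid `i32`, skipping the rest.
fn parse_valid(input: &str) -> Vec<i32> {
    input
        .split(' ')
        // `Result<i32, _>` implements `IntoIterator`: `Ok(n)` yields `n` once,
        // `Err(_)` yields nothing, so failed parses simply disappear.
        .flat_map(|token| token.parse::<i32>())
        .collect()
}
```

Both the empty piece produced by a doubled space and a non-numeric token such as "x" fail to parse and are dropped, so the errors are discarded silently; that is only appropriate when they really can be ignored.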

How to get a slice from an Iterator?

I started to use clippy as a linter. Sometimes, it shows this warning:
writing `&Vec<_>` instead of `&[_]` involves one more reference and cannot be
used with non-Vec-based slices. Consider changing the type to `&[...]`,
#[warn(ptr_arg)] on by default
I changed the parameter to a slice but this adds boilerplate on the call side. For instance, the code was:
let names = args.arguments.iter().map(|arg| {
    arg.name.clone()
}).collect();
function(&names);
but now it is:
let names = args.arguments.iter().map(|arg| {
    arg.name.clone()
}).collect::<Vec<_>>();
function(&names);
otherwise, I get the following error:
error: the trait `core::marker::Sized` is not implemented for the type
`[collections::string::String]` [E0277]
So I wonder if there is a way to convert an Iterator to a slice or avoid having to specify the collected type in this specific case.
So I wonder if there is a way to convert an Iterator to a slice
There is not.
An iterator only provides one element at a time, whereas a slice is about getting several elements at a time. This is why you first need to collect all the elements yielded by the Iterator into a contiguous array (Vec) before being able to use a slice.
The first obvious answer is not to worry about the slight overhead, though personally I would prefer placing the type hint next to the variable (I find it more readable):
let names: Vec<_> = args.arguments.iter().map(|arg| {
    arg.name.clone()
}).collect();
function(&names);
Another option would be for function to take an Iterator instead (and an iterator of references, at that):
let names = args.arguments.iter().map(|arg| &arg.name);
function(names);
After all, iterators are more general, and you can always "realize" the slice inside the function if you need to.
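A hypothetical shape for such an iterator-taking function is sketched below. The Argument struct and the join_names function are invented for illustration, since the question shows neither the argument type nor what function does with the names.

```rust
/// Stand-in for the element type of `args.arguments` (an assumption).
struct Argument {
    name: String,
}

/// Accepts anything iterable whose items can be viewed as `&str`,
/// so callers can pass a `map(...)` iterator, a `&Vec<String>`, or a `Vec<&str>`.
fn join_names<I>(names: I) -> String
where
    I: IntoIterator,
    I::Item: AsRef<str>,
{
    let mut out = String::new();
    for (i, name) in names.into_iter().enumerate() {
        if i > 0 {
            out.push_str(", ");
        }
        out.push_str(name.as_ref());
    }
    out
}
```

A call such as join_names(arguments.iter().map(|arg| &arg.name)) then needs neither an intermediate Vec nor a clone of each name.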
So I wonder if there is a way to convert an Iterator to a slice
There is. (in applicable cases)
I got here by searching for "rust iter to slice"; for my use case, there was a solution:
fn main() {
    // example struct
    #[derive(Debug)]
    struct A(u8);
    let list = vec![A(5), A(6), A(7)];
    // list_ref passed into a function somewhere ...
    let list_ref: &[A] = &list;
    let mut iter = list_ref.iter();
    // consume some ...
    let _a5: Option<&A> = iter.next();
    // now want to eg. return a slice of the rest
    let slice: &[A] = iter.as_slice();
    println!("{:?}", slice); // [A(6), A(7)]
}
That said, .as_slice is defined on an iter of an existing slice, so the previous answerer was correct in that if you've got, eg. a map iter, you would need to collect it first (so there is something to slice from).
docs: https://doc.rust-lang.org/std/slice/struct.Iter.html#method.as_slice

Why does cloned() allow this function to compile

I'm starting to learn Rust and I tried to implement a function to reverse a vector of strings. I found a solution but I don't understand why it works.
This works:
fn reverse_strings(strings:Vec<&str>) -> Vec<&str> {
    let actual: Vec<_> = strings.iter().cloned().rev().collect();
    return actual;
}
But this doesn't.
fn reverse_strings(strings:Vec<&str>) -> Vec<&str> {
    let actual: Vec<_> = strings.iter().rev().collect(); // without clone
    return actual;
}
Error message
src/main.rs:28:10: 28:16 error: mismatched types:
expected `collections::vec::Vec<&str>`,
found `collections::vec::Vec<&&str>`
(expected str,
found &-ptr) [E0308]
Can someone explain to me why? What happens in the second function? Thanks!
So the call to .cloned() is essentially like doing .map(|i| i.clone()) in the same position (i.e. you can replace the former with the latter).
The thing is that when you call iter(), you're iterating/operating on references to the items being iterated. Notice that the vector already consists of 'references', specifically string slices.
So to zoom in a bit, let's replace cloned() with the equivalent map() that I mentioned above, for pedagogical purposes, since they are equivalent. This is what it actually looks like:
.map(|i: & &str| i.clone())
So notice that that's a reference to a reference (slice), because like I said, iter() operates on references to the items, not the items themselves. So since a single element in the vector being iterated is of type &str, then we're actually getting a reference to that, i.e. & &str. By calling clone() on each of these items, we go from a & &str to a &str, just like calling .clone() on a &i64 would result in an i64.
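To make the two element types concrete, here is a small sketch with the types spelled out in the signatures. The function names by_reference and by_value are made up for this illustration.

```rust
/// Collects what `iter()` yields: references to the `&str` elements.
/// The result borrows from the slice itself (lifetime 'v).
fn by_reference<'v, 's>(strings: &'v [&'s str]) -> Vec<&'v &'s str> {
    strings.iter().collect()
}

/// Collects copies of the `&str` elements; `cloned()` turns each `&&str`
/// into a `&str`, so the result no longer borrows from the slice.
fn by_value<'s>(strings: &[&'s str]) -> Vec<&'s str> {
    strings.iter().cloned().collect()
}
```

Only the second version produces the Vec<&str> that the function in the question promises to return.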
So to bring everything together, iter() iterates over references to the elements. So if you create a new vector from the collected items yielded by the iterator you construct (which you constructed by calling iter()) you would get a vector of references to references, that is:
let actual: Vec<& &str> = strings.iter().rev().collect();
So first of all realize that this is not the same as the type you're saying the function returns, Vec<&str>. More fundamentally, however, the lifetimes of these references would be local to the function, so even if you changed the return type to Vec<& &str> you would get a lifetime error.
Something else you could do, however, is to use the into_iter() method. This method actually does iterate over each element, not a reference to it. However, this means that the elements are moved from the original iterator/container. This is only possible in your situation because you're passing the vector by value, so you're allowed to move elements out of it.
fn reverse_strings(strings:Vec<&str>) -> Vec<&str> {
    let actual: Vec<_> = strings.into_iter().rev().collect();
    return actual;
}
This probably makes a bit more sense than cloning: since we are passed the vector by value, we're allowed to do anything with the elements, including moving them to a different location (in this case the new, reversed vector). And even if we don't, the vector will be dropped at the end of that function anyway, so we might as well. Cloning would be more appropriate if we're not allowed to do that (e.g. if we were passed the vector by reference, or more likely a slice instead of a vector).
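For the by-reference case mentioned last, a sketch of a slice-taking variant might look like this. The signature is an assumption; the question only shows the by-value version.

```rust
/// Reverses a borrowed slice of string slices without consuming it.
/// The elements cannot be moved out of a borrowed slice, so each `&str`
/// is copied out with `cloned()` instead.
fn reverse_strings<'a>(strings: &[&'a str]) -> Vec<&'a str> {
    strings.iter().rev().cloned().collect()
}
```

Cloning a &str only copies the pointer and length, not the string data, so this costs about the same as the into_iter() version while leaving the caller's vector usable afterwards.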

Resources