How do I split a string twice in Rust?

I need to split the String "fooo:3333#baaar:22222",
first by # and then by :,
and the result must be a Vec<Vec<&str, i64>>.
For the first step (splitting by #) I came up with
.split('#').collect::<Vec<&str>>()
but I can't think of a solution for the second step.

A Vec<&str, i64> is not a thing, so I assume you meant a Vec of (&str, i64) tuples.
You can create that by splitting first, then mapping over the chunks.
let v = s
.split('#') // split first time
// "map" over the chunks and only take those where
// the conversion to i64 worked
.filter_map(|c| {
// split once returns an `Option<(&str, &str)>`
c.split_once(':')
// so we use `and_then` and return another `Option`
// when the conversion worked (`.ok()` converts the `Result` to an `Option`)
.and_then(|(l, r)| r.parse().ok().map(|r| (l, r)))
})
.collect::<Vec<(&str, i64)>>();
References:
https://doc.rust-lang.org/std/primitive.str.html#method.split
https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.filter_map
https://doc.rust-lang.org/core/primitive.str.html#method.split_once
https://doc.rust-lang.org/core/option/enum.Option.html#method.and_then
https://doc.rust-lang.org/std/primitive.str.html#method.parse
https://doc.rust-lang.org/std/result/enum.Result.html#method.ok
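For reference, a self-contained version of the approach above might look like the following sketch. The function name parse_pairs and the second test input are made up for illustration. Note that filter_map silently drops any chunk that has no ':' or whose right-hand side is not a valid i64, which may or may not be what you want.

```rust
// Hypothetical helper (not from the original answer):
// split on '#', then split each chunk once on ':'.
fn parse_pairs(s: &str) -> Vec<(&str, i64)> {
    s.split('#')
        .filter_map(|chunk| {
            // `?` returns None from the closure, so malformed chunks are skipped.
            let (name, number) = chunk.split_once(':')?;
            let number = number.parse::<i64>().ok()?;
            Some((name, number))
        })
        .collect()
}

fn main() {
    let v = parse_pairs("fooo:3333#baaar:22222");
    assert_eq!(v, vec![("fooo", 3333), ("baaar", 22222)]);

    // Chunks without ':' or with a non-numeric value are dropped.
    let w = parse_pairs("a:1#broken#b:x#c:3");
    assert_eq!(w, vec![("a", 1), ("c", 3)]);

    println!("{:?}", v);
}
```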

You can call a closure on each element of the iterator returned by the first split, split again inside the closure, and push the resulting values to a vector.
let mut out = Vec::new();
s.split('#').for_each(|x| out.push(x.split(":").collect::<Vec<&str>>()));

Related

What is `|slice|`?

Can anyone explain to me how the |slice| in this statement works?
Source (l35): https://gist.github.com/madhavanmalolan/b30b47640449f92ea00e4075d63460a6
let amount = rest_of_data
.get(..8)
.and_then(|slice| slice.try_into().ok())
.map(u64::from_le_bytes)
.unwrap();
This code tries to transform a byte array to a u64.
let amount = rest_of_data // &[u8]
.get(..8) // Option<&[u8]>
.and_then(|slice /* &[u8] */ | slice.try_into().ok()) // Option<[u8; 8]>
.map(u64::from_le_bytes) // Option<u64>
.unwrap();
rest_of_data is a reference to a [u8]. The get(..8) method tries to get the first eight elements of rest_of_data and returns an Option<&[u8]> (Some if there are at least eight elements in the slice, None if there are fewer).
As for your question:
To turn bytes into a u64 with u64::from_le_bytes, the input needs to be an owned array ([u8; 8]), not a reference to a slice. Calling .and_then(f) on an Option<&[u8]> passes the val from Some(val), here a &[u8], to the closure f. The closure takes one argument (named slice in this example) and returns its result, which .map() then feeds to u64::from_le_bytes.
Since u64::from_le_bytes takes an owned value, you need to transform the byte slice into an owned type, which is what the closure does with .try_into(). That gives you a byte array wrapped in a Result, which .ok() turns into an Option.
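To see the chain in isolation, here is a minimal sketch that wraps it in a function. The name read_amount and the sample byte values are assumptions made for illustration; unlike the original snippet, it returns the Option instead of calling unwrap(), so short input yields None rather than a panic.

```rust
use std::convert::TryInto; // already in the prelude since the 2021 edition

// Hypothetical wrapper around the chain from the question.
fn read_amount(rest_of_data: &[u8]) -> Option<u64> {
    rest_of_data
        .get(..8)                                // Option<&[u8]>
        .and_then(|slice| slice.try_into().ok()) // Option<[u8; 8]>
        .map(u64::from_le_bytes)                 // Option<u64>
}

fn main() {
    // Little-endian: the least significant byte comes first.
    assert_eq!(read_amount(&[1, 0, 0, 0, 0, 0, 0, 0]), Some(1));
    // Bytes after the first eight are ignored.
    assert_eq!(read_amount(&[0, 1, 0, 0, 0, 0, 0, 0, 0xFF]), Some(256));
    // Fewer than eight bytes: get(..8) returns None.
    assert_eq!(read_amount(&[1, 2, 3]), None);
}
```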

How to extract a value from a set of Strings?

I have a set of strings that I get using the lines() function. The strings look like
abcdjf hfdf
test oinf=ddfn
cbdfk test12345=my value
mngf jdk
I want to get "my value" from the above strings, so I am using this code:
body.lines()
.filter(|s| s.contains("test12345="))
.map(|x| x.split("=")[1]).to_string();
But it's not working and not returning any value. What is the correct code for this?
First of all, you cannot call to_string on an iterator. Second, split returns an iterator as well, so you cannot index it (i.e. [1]); instead you'd need to call nth(1).
body
.lines()
.filter(|s| s.contains("test12345="))
.map(|x| x.split("=").nth(1))
In case there can be multiple = after the first one, which you want to retain in the value, then instead use splitn(2, "="), i.e.:
.map(|x| x.splitn(2, "=").nth(1))
Also, given your filter, everything is needlessly wrapped in Some(..). To avoid that, you can combine the filter and map using filter_map.
body
.lines()
.filter_map(|s| {
if s.contains("test12345=") {
s.splitn(2, "=").nth(1)
} else {
None
}
});
Since you attempted to use to_string: if you do want the iterator to yield String instead of &str, you can add .map(ToString::to_string) either after nth(1) or after filter_map(..).
Iterator::map() returns an iterator, not a value, so you can't call to_string() on it. Likewise, str::split() does not return a slice but an iterator, so you can't index it with [1]; you must use the iterator API instead. As far as Rust knows, there could be multiple lines that contain "test12345=", so you have to deal with that. To do so, .collect() your results into a Vec<String>:
let values: Vec<String> = body.lines()
.filter(|s| s.contains("test12345="))
.map(|x| x.split("=").nth(1).unwrap().to_string())
.collect();
Now, that doesn't look nice or idiomatic, does it? Since .filter().map() is a common pattern, there's .filter_map(), which accomplishes both in a single call. Handily, it expects the closure to return Option<T>, so you can use ? for early returns if needed.
let values: Vec<String> = body.lines()
.filter_map(|line| {
if !line.contains("test12345=") {
return None;
}
line.split("=").nth(1).map(String::from)
})
.collect();
Iterator::nth() gives you the nth element of the iterator, but that element might not exist, which is why it returns an Option. With Option::map() you can convert the &str to a String if there is a value. Passing String::from as the argument to .map() converts the Option<&str> into an Option<String>, which matches the return type of the closure, so now you have what you're looking for.
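Putting the pieces together, a runnable sketch using the sample input from the question could look like this. The function name extract_values and its marker parameter are made up for illustration; it uses splitn(2, ..) so that any later '=' stays part of the value.

```rust
// Hypothetical helper: collect the text after the first '=' from every
// line that contains `marker`.
fn extract_values(body: &str, marker: &str) -> Vec<String> {
    body.lines()
        .filter_map(|line| {
            if !line.contains(marker) {
                return None;
            }
            // At most two pieces: everything after the first '=' is the value.
            line.splitn(2, '=').nth(1).map(String::from)
        })
        .collect()
}

fn main() {
    let body = "abcdjf hfdf\ntest oinf=ddfn\ncbdfk test12345=my value\nmngf jdk";
    assert_eq!(extract_values(body, "test12345="), vec!["my value".to_string()]);

    // A second '=' is kept in the value thanks to splitn(2, ..).
    assert_eq!(
        extract_values("k test12345=a=b", "test12345="),
        vec!["a=b".to_string()]
    );
}
```

Note that this splits on the first '=' in the line, not on the marker itself, so a line with an earlier '=' before the marker would yield a longer value.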

How can I iterate over a delimited string, accumulating state from previous iterations without explicitly tracking the state?

I want to produce an iterator over a delimited string such that each substring separated by the delimiter is returned on each iteration with the substring from the previous iteration, including the delimiter.
For example, given the string "ab:cde:fg", the iterator should return the following:
"ab"
"ab:cde"
"ab:cde:fg"
Simple Solution
A simple solution is to just iterate over the parts returned from splitting on the delimiter, keeping track of the previous path:
let mut state = String::new();
for part in "ab:cde:fg".split(':') {
if !state.is_empty() {
state.push_str(":");
}
state.push_str(part);
dbg!(&state);
}
The downside here is the need to explicitly keep track of the state with an extra mutable variable.
Using scan
I thought scan could be used to hide the state:
"ab:cde:fg"
.split(":")
.scan(String::new(), |state, x| {
if !state.is_empty() {
state.push_str(":");
}
state.push_str(x);
Some(&state)
})
.for_each(|x| { dbg!(x); });
However, this fails with the error:
cannot infer an appropriate lifetime for borrow expression due to conflicting requirements
What is the problem with the scan version and how can it be fixed?
Why even build a new string?
You can get the indices of the : and use slices to the original string.
fn main() {
let test = "ab:cde:fg";
let strings = test
.match_indices(":") // get the positions of the `:`
.map(|(i, _)| &test[0..i]) // get the string to that position
.chain(std::iter::once(test)); // let's not forget about the entire string
for substring in strings {
println!("{:?}", substring);
}
}
First of all, let us cheat and get your code to compile, so that we can inspect the issue at hand. We can do so by cloning the state. Also, let's add some debug message:
fn main() -> () {
"ab:cde:fg"
.split(":")
.scan(String::new(), |state, x| { // (1)
if !state.is_empty() {
state.push_str(":");
}
state.push_str(x);
eprintln!(">>> scan with {} {}", state, x);
Some(state.clone())
})
.for_each(|x| { // (2)
dbg!(x);
});
}
This results in the following output:
>>> scan with ab ab
[src/main.rs:13] x = "ab"
>>> scan with ab:cde cde
[src/main.rs:13] x = "ab:cde"
>>> scan with ab:cde:fg fg
[src/main.rs:13] x = "ab:cde:fg"
Note how the eprintln! and dbg! outputs are interleaved? That's the result of Iterator's laziness. However, in practice, this means that our intermediate String is borrowed twice:
in the anonymous function |state, x| in state (1)
in the anonymous function |x| in, well, x (2)
Two simultaneous borrows are not allowed when at least one of them is mutable. scan hands the closure a mutable borrow of the String that is only valid for that single call, whereas the latter function still needs the returned reference to be alive after the call. Even if we somehow managed to annotate lifetimes, we would just end up with an invalid borrow in (2), as the value is still borrowed as mutable.
The easy way out is a clone. The smarter way out uses match_indices and string slices.
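Both ways out can be sketched side by side. The function names prefixes_cloned and prefixes_sliced are made up for illustration; the first allocates a new String per step, the second only borrows from the input.

```rust
// Hypothetical helper: the scan version, fixed by cloning the state.
fn prefixes_cloned(input: &str, delim: char) -> Vec<String> {
    input
        .split(delim)
        .scan(String::new(), |state, part| {
            if !state.is_empty() {
                state.push(delim);
            }
            state.push_str(part);
            // Return an owned copy, so nothing borrows from `state`.
            Some(state.clone())
        })
        .collect()
}

// Hypothetical helper: the match_indices version, slicing the original string.
fn prefixes_sliced(input: &str, delim: char) -> Vec<&str> {
    input
        .match_indices(delim)              // positions of the delimiter
        .map(|(i, _)| &input[..i])         // prefix up to each delimiter
        .chain(std::iter::once(input))     // plus the entire string
        .collect()
}

fn main() {
    let expected = vec!["ab", "ab:cde", "ab:cde:fg"];
    assert_eq!(prefixes_cloned("ab:cde:fg", ':'), expected);
    assert_eq!(prefixes_sliced("ab:cde:fg", ':'), expected);
}
```

The two agree on the example input. They can differ on edge cases: with an empty leading segment (for instance ":a"), the is_empty check in the clone version skips a delimiter, while the slicing version keeps the input exactly as written.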

How to get a slice from an Iterator?

I started to use clippy as a linter. Sometimes, it shows this warning:
writing `&Vec<_>` instead of `&[_]` involves one more reference and cannot be
used with non-Vec-based slices. Consider changing the type to `&[...]`,
#[warn(ptr_arg)] on by default
I changed the parameter to a slice but this adds boilerplate on the call side. For instance, the code was:
let names = args.arguments.iter().map(|arg| {
arg.name.clone()
}).collect();
function(&names);
but now it is:
let names = args.arguments.iter().map(|arg| {
arg.name.clone()
}).collect::<Vec<_>>();
function(&names);
otherwise, I get the following error:
error: the trait `core::marker::Sized` is not implemented for the type
`[collections::string::String]` [E0277]
So I wonder if there is a way to convert an Iterator to a slice or avoid having to specify the collected type in this specific case.
So I wonder if there is a way to convert an Iterator to a slice
There is not.
An iterator only provides one element at a time, whereas a slice is about getting several elements at a time. This is why you first need to collect all the elements yielded by the Iterator into a contiguous array (Vec) before being able to use a slice.
The first obvious answer is not to worry about the slight overhead, though personally I would prefer placing the type hint next to the variable (I find it more readable):
let names: Vec<_> = args.arguments.iter().map(|arg| {
arg.name.clone()
}).collect();
function(&names);
Another option would be for function to take an Iterator instead (and an iterator of references, at that):
let names = args.arguments.iter().map(|arg| &arg.name);
function(names);
After all, iterators are more general, and you can always "realize" the slice inside the function if you need to.
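As a rough illustration of the two signatures, here is a sketch with a made-up Argument struct and a made-up function that sums name lengths. Neither comes from the question; they only stand in for args.arguments and function.

```rust
// Hypothetical stand-in for the elements of `args.arguments`.
struct Argument {
    name: String,
}

// Iterator-based: accepts anything that yields &String, no Vec required.
fn total_len<'a, I>(names: I) -> usize
where
    I: IntoIterator<Item = &'a String>,
{
    names.into_iter().map(|name| name.len()).sum()
}

// Slice-based: the caller must have the names in contiguous memory.
fn total_len_slice(names: &[String]) -> usize {
    names.iter().map(|name| name.len()).sum()
}

fn main() {
    let arguments = vec![
        Argument { name: "foo".to_string() },
        Argument { name: "quux".to_string() },
    ];

    // No intermediate Vec and no clone: pass the mapped iterator directly.
    assert_eq!(total_len(arguments.iter().map(|arg| &arg.name)), 7);

    // The slice version needs the names collected first.
    let names: Vec<String> = arguments.iter().map(|arg| arg.name.clone()).collect();
    assert_eq!(total_len_slice(&names), 7);

    // A &Vec<String> also satisfies the iterator-based signature.
    assert_eq!(total_len(&names), 7);
}
```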
So I wonder if there is a way to convert an Iterator to a slice
There is (in applicable cases).
I got here searching for "rust iter to slice"; for my use case, there was a solution:
fn main() {
// example struct
#[derive(Debug)]
struct A(u8);
let list = vec![A(5), A(6), A(7)];
// list_ref passed into a function somewhere ...
let list_ref: &[A] = &list;
let mut iter = list_ref.iter();
// consume some ...
let _a5: Option<&A> = iter.next();
// now want to eg. return a slice of the rest
let slice: &[A] = iter.as_slice();
println!("{:?}", slice); // [A(6), A(7)]
}
That said, .as_slice is defined on an iter of an existing slice, so the previous answerer was correct in that if you've got, eg. a map iter, you would need to collect it first (so there is something to slice from).
docs: https://doc.rust-lang.org/std/slice/struct.Iter.html#method.as_slice

String append, cannot move out of dereference of '&'pointer

I'm having trouble combining two strings. I'm very new to Rust, so if there is an easier way to do this please feel free to show me.
My function loops through a vector of string tuples (String, String). What I want is to combine the two string elements of each tuple into one string. Here's what I have:
for tup in bmp.bitmap_picture.mut_iter() {
let &(ref x, ref y) = tup;
let res_string = x;
res_string.append(y.as_slice());
}
but I receive the error : error: cannot move out of dereference of '&'-pointer for the line: res_string.append(y.as_slice());
I also tried res_string.append(y.clone().as_slice()); but the exact same error happened, so I'm not sure if that was even right to do.
The function definition of append is:
fn append(self, second: &str) -> String
The plain self indicates by-value semantics. By-value moves the receiver into the method, unless the receiver implements Copy (which String does not). So you have to clone the x rather than the y.
If you want to move out of a vector, you have to use move_iter.
There are a few other improvements possible as well:
let string_pairs = vec![("Foo".to_string(),"Bar".to_string())];
// Option 1: leave original vector intact
let mut strings = Vec::new();
for &(ref x, ref y) in string_pairs.iter() {
let string = x.clone().append(y.as_slice());
strings.push(string);
}
// Option 2: consume original vector
let strings: Vec<String> = string_pairs.move_iter()
.map(|(x, y)| x.append(y.as_slice()))
.collect();
It seems like you might be confusing append, which takes the receiver by value and returns itself, with push_str, which simply mutates the receiver (passed by mutable reference) as you seem to expect. So the simplest fix to your example is to change append to push_str. You'll also need to change "ref x" to "ref mut x" so it can be mutated.
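The snippets in this thread predate Rust 1.0: mut_iter and move_iter have since become iter_mut and into_iter, and String::append no longer exists. Assuming a current compiler, the same options might be sketched as follows. The function names are made up for illustration, and push_str takes over the role of append.

```rust
// Option 1: leave the original pairs intact (clones the first string).
fn join_keeping(pairs: &[(String, String)]) -> Vec<String> {
    pairs
        .iter()
        .map(|(x, y)| {
            let mut joined = x.clone();
            joined.push_str(y);
            joined
        })
        .collect()
}

// Option 2: consume the original vector (no clone needed).
fn join_consuming(pairs: Vec<(String, String)>) -> Vec<String> {
    pairs
        .into_iter()
        .map(|(mut x, y)| {
            x.push_str(&y);
            x
        })
        .collect()
}

// Option 3: mutate the first string of each pair in place.
fn join_in_place(pairs: &mut [(String, String)]) {
    for (x, y) in pairs.iter_mut() {
        x.push_str(y.as_str());
    }
}

fn main() {
    let mut string_pairs = vec![("Foo".to_string(), "Bar".to_string())];

    assert_eq!(join_keeping(&string_pairs), vec!["FooBar".to_string()]);
    assert_eq!(join_consuming(string_pairs.clone()), vec!["FooBar".to_string()]);

    join_in_place(&mut string_pairs);
    assert_eq!(string_pairs[0].0, "FooBar");
}
```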
