How I could get the first letters from a sentence; example: "Rust is a fast reliable programming language" should return the output riafrpl.
fn main() {
let string: &'static str = "Rust is a fast reliable programming language";
println!("First letters: {}", string);
}
let initials: String = string
.split(" ") // create an iterator, yielding words
.flat_map(|s| s.chars().nth(0)) // get the first char of each word
.collect(); // collect the result into a String
This is a perfect task for Rust's iterators; here's how I would do it:
fn main() {
let string: &'static str = "Rust is a fast reliable programming language";
let first_letters = string
.split_whitespace() // split string into words
.map(|word| word // map every word with the following:
.chars() // split it into separate characters
.next() // pick the first character
.unwrap() // take the character out of the Option wrap
)
.collect::<String>(); // collect the characters into a string
println!("First letters: {}", first_letters); // First letters: Riafrpl
}
Related
The following code prints out:
1
a
2
a
I don't get this. why does this happen?
fn main() {
let s = "a ";
let sv1:Vec<&str> = s.split_whitespace().collect();
println!("{}", sv1.len());
for x in sv1.iter() {
println!("{}", x);
}
let sv2:Vec<&str> = s.split(' ').collect();
println!("{}", sv2.len());
for x in sv2.iter() {
println!("{}", x);
}
}
As per the method docs, split_whitespace returns std::str::SplitWhitespace, which is An iterator over the non-whitespace substrings of a string, separated by any amount of whitespace. which means it can split on multiple whitespaces, and doesn't include empty string in the result.
While for split method,
Contiguous separators are separated by the empty string. as well as Separators at the start or end of a string are neighbored by empty strings.
So in your example, split_whitespace gives ["a"] but split gives ["a", ""].
I'm wondering how I can remove the first and last character of a string in Rust.
Example:
Input:
"Hello World"
Output:
"ello Worl"
You can use the .chars() iterator and ignore the first and last characters:
fn rem_first_and_last(value: &str) -> &str {
let mut chars = value.chars();
chars.next();
chars.next_back();
chars.as_str()
}
It returns an empty string for zero- or one-sized strings and it will handle multi-byte unicode characters correctly. See it working on the playground.
…or the trivial solution
also works with Unicode characters:
let mut s = "Hello World".to_string();
s.pop(); // remove last
if s.len() > 0 {
s.remove(0); // remove first
}
I did this by using string slices.
fn main() {
let string: &str = "Hello World";
let first_last_off: &str = &string[1..string.len() - 1];
println!("{}", first_last_off);
}
I took all the characters in the string until the end of the string - 1.
You can also use split_at().
let msg = "Hello, world!";
let msg = msg.split_at(msg.len() - 1);
let msg = msg.0.split_at(1);
println!("{}", msg.1);
ello, world
split_at() returns the following: (&str, &str).
I am attempting to write a lexer for fun, however something keeps bothering me.
let mut chars: Vec<char> = Vec::new();
let mut contents = String::new();
let mut tokens: Vec<&String> = Vec::new();
let mut append = String::new();
//--snip--
for _char in chars {
append += &_char.to_string();
append = append.trim().to_string();
if append.contains("print") {
println!("print found at: \n{}", append);
append = "".to_string();
}
}
Any time I want to do something as simple as append a &str to a String I have to convert it using .to_string, String::from(), .to_owned, etc.
Is there something I am doing wrong, so that I don't have to constantly do this, or is this the primary way of appending?
If you're trying to do something with a type, check the documentation. From the documentation for String:
push: "Appends the given char to the end of this String."
push_str: "Appends a given string slice onto the end of this String."
It's important to understand the differences between String and &str, and why different methods accept and return each of them.
A &str or &mut str are usually preferred in function arguments and return types. That's because they are just pointers to data so nothing needs to be copied or moved when they are passed around.
A String is returned when a function needs to do some new allocation, while &str and &mut str are slices into an existing String. Even though &mut str is mutable, you can't mutate it in a way that increases its length because that would require additional allocation.
The trim function is able to return a &str slice because that doesn't involve mutating the original string - a trimmed string is just a substring, which a slice perfectly describes. But sometimes that isn't possible; for example, a function that pads a string with an extra character would have to return a String because it would be allocating new memory.
You can reduce the number of type conversions in your code by choosing different methods:
for c in chars {
append.push(c); // append += &_char.to_string();
append = append.trim().to_string();
if append.contains("print") {
println!("print found at: \n{}", append);
append.clear(); // append = "".to_string();
}
}
There isn't anything like a trim_in_place method for String, so the way you have done it is probably the only way.
I want to get the first character of a std::str. The method char_at() is currently unstable, as is String::slice_chars.
I have come up with the following, but it seems excessive to get a single character and not use the rest of the vector:
let text = "hello world!";
let char_vec: Vec<char> = text.chars().collect();
let ch = char_vec[0];
UTF-8 does not define what "character" is so it depends on what you want. In this case, chars are Unicode scalar values, and so the first char of a &str is going to be between one and four bytes.
If you want just the first char, then don't collect into a Vec<char>, just use the iterator:
let text = "hello world!";
let ch = text.chars().next().unwrap();
Alternatively, you can use the iterator's nth method:
let ch = text.chars().nth(0).unwrap();
Bear in mind that elements preceding the index passed to nth will be consumed from the iterator.
I wrote a function that returns the head of a &str and the rest:
fn car_cdr(s: &str) -> (&str, &str) {
for i in 1..5 {
let r = s.get(0..i);
match r {
Some(x) => return (x, &s[i..]),
None => (),
}
}
(&s[0..0], s)
}
Use it like this:
let (first_char, remainder) = car_cdr("test");
println!("first char: {}\nremainder: {}", first_char, remainder);
The output looks like:
first char: t
remainder: est
It works fine with chars that are more than 1 byte.
Get the first single character out of a string w/o using the rest of that string:
let text = "hello world!";
let ch = text.chars().take(1).last().unwrap();
It would be nice to have something similar to Haskell's head function and tail function for such cases.
I wrote this function to act like head and tail together (doesn't match exact implementation)
pub fn head_tail<T: Iterator, O: FromIterator<<T>::Item>>(iter: &mut T) -> (Option<<T>::Item>, O) {
(iter.next(), iter.collect::<O>())
}
Usage:
// works with Vec<i32>
let mut val = vec![1, 2, 3].into_iter();
println!("{:?}", head_tail::<_, Vec<i32>>(&mut val));
// works with chars in two ways
let mut val = "thanks! bedroom builds YT".chars();
println!("{:?}", head_tail::<_, String>(&mut val));
// calling the function with Vec<char>
let mut val = "thanks! bedroom builds YT".chars();
println!("{:?}", head_tail::<_, Vec<char>>(&mut val));
NOTE: The head_tail function doesn't panic! if the iterator is empty. If this matched Haskell's head/tail output, this would have thrown an exception if the iterator was empty. It might also be good to use iterable trait to be more compatible to other types.
If you only want to test for it, you can use starts_with():
"rust".starts_with('r')
"rust".starts_with(|c| c == 'r')
I think it is pretty straight forward
let text = "hello world!";
let c: char = text.chars().next().unwrap();
next() takes the next item from the iterator
To “unwrap” something in Rust is to say, “Give me the result of the computation, and if there was an error, panic and stop the program.”
The accepted answer is a bit ugly!
let text = "hello world!";
let ch = &text[0..1]; // this returns "h"
From the documentation, it's not clear. In Java you could use the split method like so:
"some string 123 ffd".split("123");
Use split()
let mut split = "some string 123 ffd".split("123");
This gives an iterator, which you can loop over, or collect() into a vector.
for s in split {
println!("{}", s)
}
let vec = split.collect::<Vec<&str>>();
// OR
let vec: Vec<&str> = split.collect();
There are three simple ways:
By separator:
s.split("separator") | s.split('/') | s.split(char::is_numeric)
By whitespace:
s.split_whitespace()
By newlines:
s.lines()
By regex: (using regex crate)
Regex::new(r"\s").unwrap().split("one two three")
The result of each kind is an iterator:
let text = "foo\r\nbar\n\nbaz\n";
let mut lines = text.lines();
assert_eq!(Some("foo"), lines.next());
assert_eq!(Some("bar"), lines.next());
assert_eq!(Some(""), lines.next());
assert_eq!(Some("baz"), lines.next());
assert_eq!(None, lines.next());
There is a special method split for struct String:
fn split<'a, P>(&'a self, pat: P) -> Split<'a, P> where P: Pattern<'a>
Split by char:
let v: Vec<&str> = "Mary had a little lamb".split(' ').collect();
assert_eq!(v, ["Mary", "had", "a", "little", "lamb"]);
Split by string:
let v: Vec<&str> = "lion::tiger::leopard".split("::").collect();
assert_eq!(v, ["lion", "tiger", "leopard"]);
Split by closure:
let v: Vec<&str> = "abc1def2ghi".split(|c: char| c.is_numeric()).collect();
assert_eq!(v, ["abc", "def", "ghi"]);
split returns an Iterator, which you can convert into a Vec using collect: split_line.collect::<Vec<_>>(). Going through an iterator instead of returning a Vec directly has several advantages:
split is lazy. This means that it won't really split the line until you need it. That way it won't waste time splitting the whole string if you only need the first few values: split_line.take(2).collect::<Vec<_>>(), or even if you need only the first value that can be converted to an integer: split_line.filter_map(|x| x.parse::<i32>().ok()).next(). This last example won't waste time attempting to process the "23.0" but will stop processing immediately once it finds the "1".
split makes no assumption on the way you want to store the result. You can use a Vec, but you can also use anything that implements FromIterator<&str>, for example a LinkedList or a VecDeque, or any custom type that implements FromIterator<&str>.
There's also split_whitespace()
fn main() {
let words: Vec<&str> = " foo bar\t\nbaz ".split_whitespace().collect();
println!("{:?}", words);
// ["foo", "bar", "baz"]
}
The OP's question was how to split with a multi-character string and here is a way to get the results of part1 and part2 as Strings instead in a vector.
Here splitted with the non-ASCII character string "☄☃🤔" in place of "123":
let s = "☄☃🤔"; // also works with non-ASCII characters
let mut part1 = "some string ☄☃🤔 ffd".to_string();
let _t;
let part2;
if let Some(idx) = part1.find(s) {
part2 = part1.split_off(idx + s.len());
_t = part1.split_off(idx);
}
else {
part2 = "".to_string();
}
gets: part1 = "some string "
part2 = " ffd"
If "☄☃🤔" not is found part1 contains the untouched original String and part2 is empty.
Here is a nice example in Rosetta Code -
Split a character string based on change of character - of how you can turn a short solution using split_off:
fn main() {
let mut part1 = "gHHH5YY++///\\".to_string();
if let Some(mut last) = part1.chars().next() {
let mut pos = 0;
while let Some(c) = part1.chars().find(|&c| {if c != last {true} else {pos += c.len_utf8(); false}}) {
let part2 = part1.split_off(pos);
print!("{}, ", part1);
part1 = part2;
last = c;
pos = 0;
}
}
println!("{}", part1);
}
into that
Task
Split a (character) string into comma (plus a blank) delimited strings based on a change of character (left to right).
If you are looking for the Python-flavoured split where you tuple-unpack the two ends of the split string, you can do
if let Some((a, b)) = line.split_once(' ') {
// ...
}