Converting multiline string to vector of integers [duplicate] - rust

I'm writing on STDIN a string of numbers (e.g 4 10 30 232312) and I want to read that and convert to an array (or a vector) of integers, but I can't find the right way. So far I have:
use std::io;
fn main() {
let mut reader = io::stdin();
let numbers = reader.read_line().unwrap();
}

You can do something like this:
use std::io::{self, BufRead}; // (a)
fn main() {
let reader = io::stdin();
let numbers: Vec<i32> =
reader.lock() // (0)
.lines().next().unwrap().unwrap() // (1)
.split(' ').map(|s| s.trim()) // (2)
.filter(|s| !s.is_empty()) // (3)
.map(|s| s.parse().unwrap()) // (4)
.collect(); // (5)
println!("{:?}", numbers);
}
First, we take a lock on stdin (0), which lets you work with it as a buffered reader. The Stdin handle itself does not implement BufRead; calling lock() returns a StdinLock, which does. The underlying buffer is shared by all threads in your program, hence access to it has to be synchronized.
Next, we read the next line (1); I'm using the lines() iterator whose next() method returns Option<io::Result<String>>, therefore to obtain just String you need to unwrap() twice.
Then we split it by spaces and trim resulting chunks from extra whitespace (2), remove empty chunks which were left after trimming (3), convert strings to i32s (4) and collect the result to a vector (5).
We also need to import std::io::BufRead trait (a) in order to use the lines() method.
If you know in advance that your input won't contain more than one space between numbers, you can omit step (3) and move the trim() call from (2) to (1):
let numbers: Vec<i32> =
reader.lock()
.lines().next().unwrap().unwrap()
.trim().split(' ')
.map(|s| s.parse().unwrap())
.collect();
Rust also provides a method to split a string into a sequence of whitespace-separated words, called split_whitespace():
let numbers: Vec<i32> =
    reader.lock()
        .lines().next().unwrap().unwrap()
        .split_whitespace()
        .map(|s| s.parse().unwrap())
        .collect();
split_whitespace() is in fact just a combination of split() and filter(), just like in my original example. It uses a split() function argument which checks for different kinds of whitespace, not only space characters.
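To make the difference concrete, here is a minimal sketch comparing the two splitting strategies on a line with a double space, a tab and a trailing newline. The helper names tokens_by_space and tokens_by_whitespace are made up for this illustration.

```rust
/// Hypothetical helper: split on the space character only.
fn tokens_by_space(line: &str) -> Vec<&str> {
    line.split(' ').collect()
}

/// Hypothetical helper: split on any run of Unicode whitespace.
fn tokens_by_whitespace(line: &str) -> Vec<&str> {
    line.split_whitespace().collect()
}

fn main() {
    let line = "4  10\t30 232312\n";
    // split(' ') keeps an empty chunk, the tab and the newline
    println!("{:?}", tokens_by_space(line));
    // split_whitespace() yields only the four number tokens
    println!("{:?}", tokens_by_whitespace(line));
}
```

With split(' ') you still have to trim and filter the chunks before parsing; with split_whitespace() every token can go straight to parse().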

On Rust 1.5.x, a working solution is:
use std::io;
fn main() {
    let mut numbers = String::new();
    io::stdin()
        .read_line(&mut numbers)
        .expect("read error");
    let numbers: Vec<i32> = numbers
        .split_whitespace()
        .map(|s| s.parse().expect("parse error"))
        .collect();
    for num in numbers {
        println!("{}", num);
    }
}

Safer version. This one skips failed parses, so a token that is not a number doesn't cause a panic.
It reads all of stdin with read_to_string; use read_line instead to read a single line.
use std::io::Read; // brings read_to_string into scope
let mut buf = String::new();
// read everything up to the end of input
std::io::stdin().read_to_string(&mut buf).expect("failed to read stdin");
// skip tokens that fail to parse instead of panicking
let v: Vec<i32> = buf
    .split_whitespace() // split string into words by whitespace
    .filter_map(|w| w.parse().ok()) // ok() turns Result into Option so that filter_map can discard None values
    .collect(); // collect items into a Vec; the element type is determined by the type annotation
You can even read a vector of vectors like this (it needs use std::io::{self, BufRead};).
let stdin = io::stdin();
let locked = stdin.lock();
let vv: Vec<Vec<i32>> = locked.lines()
.filter_map(
|l| l.ok().map(
|s| s.split_whitespace()
.filter_map(|word| word.parse().ok())
.collect()))
.collect();
This works for inputs like
2 424 -42 124
42 242 23 22 241
24 12 3 232 445
and turns them into
[[2, 424, -42, 124],
[42, 242, 23, 22, 241],
[24, 12, 3, 232, 445]]
filter_map accepts a closure that returns Option<T> and filters out all Nones.
ok() turns Result<R,E> to Option<R> so that errors can be filtered in this case.
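The same idea can be tried without a terminal by reading from an in-memory byte slice, which also implements BufRead. This is only a sketch: read_grid is a hypothetical helper name, and the sample input is invented to show that an invalid token is dropped silently.

```rust
use std::io::BufRead;

/// Hypothetical helper: parse every line of `reader` into a row of integers,
/// silently dropping unreadable lines and tokens that are not valid i32s.
fn read_grid<R: BufRead>(reader: R) -> Vec<Vec<i32>> {
    reader
        .lines()
        .filter_map(|line| line.ok()) // discard lines that failed to read
        .map(|line| {
            line.split_whitespace()
                .filter_map(|word| word.parse().ok()) // discard bad tokens
                .collect::<Vec<i32>>()
        })
        .collect()
}

fn main() {
    // "oops" is not a number, so the second row ends up with two items
    let input = "2 424 -42 124\n42 oops 23\n";
    let grid = read_grid(input.as_bytes());
    println!("{:?}", grid);
}
```

Note the trade-off: the caller cannot tell that "oops" was skipped, which is fine for quick scripts but hides malformed input.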

The safer version from Dulguun Otgon just skips all the errors.
If you don't want to skip errors, consider the following function (it needs use std::str::FromStr;).
fn parse_to_vec<'a, T, It>(it: It) -> Result<Vec<T>, <T as FromStr>::Err>
where
T: FromStr,
It: Iterator<Item = &'a str>,
{
it.map(|v| v.parse::<T>()).fold(Ok(Vec::new()), |vals, v| {
vals.and_then(|mut vals| {
v.and_then(|v| {
vals.push(v);
Ok(vals)
})
})
})
}
When using it, you can follow the usual panicking way with expect:
let numbers = parse_to_vec::<i32, _>(data_str.trim().split(" "))
.expect("can't parse data");
or the smarter way, converting the error and propagating it with ? (inside a function that returns Result<_, String>):
let numbers = parse_to_vec::<i32, _>(data_str.trim().split(" "))
.map_err(|e| format!("can't parse data: {:?}", e))?;
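As an alternative to the explicit fold, the standard library can collect an iterator of Result values directly into a Result<Vec<T>, E>, stopping at the first error. A minimal sketch, where parse_all is a hypothetical name and split_whitespace() is used instead of split(" "):

```rust
use std::str::FromStr;

/// Hypothetical helper: parse every whitespace-separated token,
/// returning the first parse error instead of skipping it.
fn parse_all<T: FromStr>(input: &str) -> Result<Vec<T>, T::Err> {
    input
        .split_whitespace()
        .map(|word| word.parse::<T>())
        .collect() // Result<Vec<T>, T::Err> implements FromIterator<Result<T, T::Err>>
}

fn main() {
    println!("{:?}", parse_all::<i32>("4 10 30 232312"));
    println!("{:?}", parse_all::<i32>("4 ten 30").is_err());
}
```

The caller can then choose between expect, map_err with ?, or a match, exactly as with parse_to_vec.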

Related

How to remove characters from specific index in String?

I have an application where I am receiving a string with some repetitive characters. I am receiving input as a String. How to remove the characters from specific index?
main.rs
fn main() {
let s:String = "{\"name\":\"xx/yyyy/machine/zzz/test_int4\",\"status\":\"online\",\"timestamp\":\"2021-06-11 18:20:42.231770800 UTC\",\"value\":7}8668982856274}".to_string();
println!("{}", s);
}
How can I get the result
"{\"name\":\"xx/yyyy/machine/zzz/test_int4\",\"status\":\"online\",\"timestamp\":\"2021-06-11 18:20:42.231770800 UTC\",\"value\":7}"
instead of
"{\"name\":\"xx/yyyy/machine/zzz/test_int4\",\"status\":\"online\",\"timestamp\":\"2021-06-11 18:20:42.231770800 UTC\",\"value\":7}8668982856274}"
String indexing works only with bytes, thus you need to find an index for the appropriate byte slice like this:
let s = "{\"name\":\"xx/yyyy/machine/zzz/test_int4\",\"status\":\"online\",\"timestamp\":\"2021-06-11 18:20:42.231770800 UTC\",\"value\":7}8668982856274}";
let closing_bracket_idx = s
.as_bytes()
.iter()
.position(|&x| x == b'}')
.map(|i| i + 1)
.unwrap_or_else(|| s.len());
let v: serde_json::Value = serde_json::from_str(&s[..closing_bracket_idx]).unwrap();
println!("{:?}", v);
However, keep in mind that this approach doesn't work in general for more complex cases, for example a } inside a JSON string value, nested objects, or a type other than an object at the top level (e.g. [1, {"2": 3}, 4]). A neater way is to let the parser ignore the trailing data; for example, with serde_json:
let v = serde_json::Deserializer::from_str(s)
.into_iter::<serde_json::Value>()
.next()
.expect("empty input")
.expect("invalid json value");
println!("{:?}", v);

Rust not properly reading integer input

I'm trying to test out my Rust skills with a simple program that reads multiple integers from a single line of input. It compiles correctly, but unfortunately when it receives the input of 1 2 3, it panics, saying that the input wasn't a valid integer. Can someone please explain the reason for this, and also provide an explanation as to how I can fix my program?
use std::io;
fn main() {
let mut string = String::new();
io::stdin().read_line(&mut string);
let int_vec: Vec<u32> = string.split(" ")
.map(|x| x.parse::<u32>().expect("Not an integer!"))
.collect();
for i in (0..int_vec.len()).rev() {
print!("{} ", int_vec[i]);
}
}
In addition to Dogbert's answer, it might be helpful to see how you can debug this sort of issue with an iterator yourself in the future.
The Iterator trait exposes an inspect function that you can use to inspect each item. Converting your code to use inspect both before and after each map results in:
let int_vec: Vec<u32> = string.split(" ")
.inspect(|x| println!("About to parse: {:?}", x))
.map(|x| {
x.parse::<u32>()
.expect("Not an integer!")
})
.inspect(|x| println!("Parsed {:?} successfully!", x))
.collect();
Outputs:
1 2 3
About to parse: "1"
Parsed 1 successfully!
About to parse: "2"
Parsed 2 successfully!
About to parse: "3\n"
thread '<main>' panicked at 'Not an integer!...
Notice what it's attempting to parse when it gets to the number 3.
Of course, you can inspect string all by itself. inspect is handy though for when iterators are involved.
This is because io::stdin().read_line(&mut String) also appends the trailing newline character to the string, which causes the last str after splitting with " " to be "3\n", which is not a valid integer. You can use str::trim() for this:
use std::io;
fn main() {
let mut string = String::new();
io::stdin().read_line(&mut string);
let int_vec: Vec<u32> = string.trim()
.split(" ")
.map(|x| {
x.parse::<u32>()
.expect("Not an integer!")
})
.collect();
for i in (0..int_vec.len()).rev() {
print!("{} ", int_vec[i]);
}
}
With this change, the program works:
$ ./a
1 2 3
3 2 1
Also, you can simplify your for loop:
for i in int_vec.iter().rev() {
print!("{} ", i);
}
You ran into the old problem of the terminating line-ending. Let's try putting
println!("{:?}", string);
in the third line of your code. For the input 1 2 3 it will print (on Windows):
"1 2 3\r\n"
So at some point you are trying to parse "3\r\n" as integer, which obviously fails. One easy way to remove trailing and leading whitespace from a string is to use trim(). This works:
let int_vec: Vec<_> = string.trim().split(" ")
.map(|x| x.parse::<u32>().expect("Not an integer!"))
.collect();
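The effect of trim() can be checked without a terminal by feeding the same line ending to both variants. This is a sketch with hypothetical helper names; the results are collected into a Result only so that the failure can be shown without panicking.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: split on single spaces without trimming first
/// (the failing approach from the question).
fn parse_untrimmed(line: &str) -> Result<Vec<u32>, ParseIntError> {
    line.split(' ').map(|x| x.parse::<u32>()).collect()
}

/// Hypothetical helper: trim the line ending first, then split
/// (the fixed approach from the answers).
fn parse_trimmed(line: &str) -> Result<Vec<u32>, ParseIntError> {
    line.trim().split(' ').map(|x| x.parse::<u32>()).collect()
}

fn main() {
    // what read_line stores on Windows for the input 1 2 3
    let line = "1 2 3\r\n";
    println!("{:?}", parse_untrimmed(line)); // an Err: "3\r\n" is not an integer
    println!("{:?}", parse_trimmed(line)); // the three numbers
}
```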

Read an arbitrary number of bytes from type implementing Read

I have something that is Read; currently it's a File. I want to read a number of bytes from it that is only known at runtime (length prefix in a binary data structure).
So I tried this:
let mut vec = Vec::with_capacity(length);
let count = file.read(vec.as_mut_slice()).unwrap();
but count is zero because vec.as_mut_slice().len() is zero as well.
[0u8;length] of course doesn't work because the size must be known at compile time.
I wanted to do
let mut vec = Vec::with_capacity(length);
let count = file.take(length).read_to_end(vec).unwrap();
but take's receiver parameter is a T and I only have &mut T (and I'm not really sure why it's needed anyway).
I guess I can replace File with BufReader and dance around with fill_buf and consume which sounds complicated enough but I still wonder: Have I overlooked something?
Like the Iterator adaptors, the IO adaptors take self by value to be as efficient as possible. Also like the Iterator adaptors, a mutable reference to a Read is also a Read.
To solve your problem, you just need Read::by_ref:
use std::io::Read;
use std::fs::File;
fn main() {
let mut file = File::open("/etc/hosts").unwrap();
let length = 5;
let mut vec = Vec::with_capacity(length);
file.by_ref().take(length as u64).read_to_end(&mut vec).unwrap();
let mut the_rest = Vec::new();
file.read_to_end(&mut the_rest).unwrap();
}
1. Fill-this-vector version
Your first solution is close to working. You identified the problem but did not try to solve it! The problem is that whatever the capacity of the vector, it is still empty (vec.len() == 0). Instead, you could actually fill it with zeroed elements, such as:
let mut vec = vec![0u8; length];
The following full code works:
use std::fs::File;
use std::io::Read;
fn main() {
    let mut file = File::open("/usr/share/dict/words").unwrap();
    let length: usize = 100;
    let mut vec = vec![0u8; length];
    // read() may fill only part of the buffer; count says how much
    let count = file.read(vec.as_mut_slice()).unwrap();
    println!("read {} bytes.", count);
    println!("vec = {:?}", vec);
}
Of course, you still have to check whether count == length, and read more data into the buffer if that's not the case.
2. Iterator version
Your second solution is better because you won't have to check how many bytes have been read, and you won't have to re-read in case count != length. You need to use the bytes() function on the Read trait (implemented by File). This transforms the file into a stream (i.e. an iterator). Because errors can still happen, you don't get an Iterator<Item = u8> but an Iterator<Item = io::Result<u8>>. Hence you need to deal with failures explicitly within the iterator. We're going to use unwrap() here for simplicity:
use std::fs::File;
use std::io::Read;
fn main() {
let file = File::open("/usr/share/dict/words").unwrap();
let length: usize = 100;
let vec: Vec<u8> = file
.bytes()
.take(length)
.map(|r: Result<u8, _>| r.unwrap()) // or deal explicitly with failure!
.collect();
println!("vec = {:?}", vec);
}
You can also use a bit of unsafe to create a vector without zeroing its memory, but this is not sound in general: the documentation of Read::read requires the buffer to be initialized, because an implementation is allowed to read from it, and exposing uninitialized bytes is undefined behavior.
let mut vec: Vec<u8> = Vec::with_capacity(length);
unsafe { vec.set_len(length); } // contents are uninitialized: see the caveat above
let count = file.read(vec.as_mut_slice()).unwrap();
This way, vec.len() is set to its capacity and the bytes in it are uninitialized (likely zeros, but possibly some garbage). Unless profiling shows that zeroing the memory matters, prefer vec![0u8; length].
Note that the read() method on Read is not guaranteed to fill the whole slice; it may return after reading fewer bytes than the slice length. There were several RFCs on adding methods to fill this gap, and Read::read_exact is the method that was eventually stabilized.
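For the length-prefixed case from the question, a sketch using Read::read_exact from the standard library might look like the following. The one-byte length prefix, the helper name read_prefixed and the in-memory Cursor standing in for the File are all assumptions made for this illustration.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical helper: read a one-byte length prefix, then exactly
/// that many payload bytes. The prefix format is an assumption.
fn read_prefixed<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut len = [0u8; 1];
    reader.read_exact(&mut len)?;
    // the buffer must have the right len(), not just the capacity
    let mut payload = vec![0u8; len[0] as usize];
    // read_exact loops until the slice is full or fails with UnexpectedEof
    reader.read_exact(&mut payload)?;
    Ok(payload)
}

fn main() -> io::Result<()> {
    // Cursor stands in for the File here; both implement Read
    let mut input = Cursor::new(vec![3u8, b'a', b'b', b'c', b'z']);
    let payload = read_prefixed(&mut input)?;
    println!("payload = {:?}", payload);
    // the reader is still usable for whatever follows the payload
    let mut rest = Vec::new();
    input.read_to_end(&mut rest)?;
    println!("rest = {:?}", rest);
    Ok(())
}
```

Unlike a single read() call, this never returns a partially filled buffer: it either fills the payload completely or reports an error.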

How do I get the first character out of a string?

I want to get the first character of a std::str. The method char_at() is currently unstable, as is String::slice_chars.
I have come up with the following, but it seems excessive to get a single character and not use the rest of the vector:
let text = "hello world!";
let char_vec: Vec<char> = text.chars().collect();
let ch = char_vec[0];
UTF-8 does not define what "character" is so it depends on what you want. In this case, chars are Unicode scalar values, and so the first char of a &str is going to be between one and four bytes.
If you want just the first char, then don't collect into a Vec<char>, just use the iterator:
let text = "hello world!";
let ch = text.chars().next().unwrap();
Alternatively, you can use the iterator's nth method:
let ch = text.chars().nth(0).unwrap();
Bear in mind that elements preceding the index passed to nth will be consumed from the iterator.
I wrote a function that returns the head of a &str and the rest:
fn car_cdr(s: &str) -> (&str, &str) {
for i in 1..5 {
let r = s.get(0..i);
match r {
Some(x) => return (x, &s[i..]),
None => (),
}
}
(&s[0..0], s)
}
Use it like this:
let (first_char, remainder) = car_cdr("test");
println!("first char: {}\nremainder: {}", first_char, remainder);
The output looks like:
first char: t
remainder: est
It works fine with chars that are more than 1 byte.
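A shorter way to get the same split is to let the chars() iterator find the character boundary: after next() has consumed the first char, Chars::as_str() returns the unconsumed remainder. In this sketch, split_first is a hypothetical name, and it returns the head as a char rather than a &str.

```rust
/// Hypothetical helper: return the first char and the rest of the string,
/// or None if the string is empty.
fn split_first(s: &str) -> Option<(char, &str)> {
    let mut chars = s.chars();
    let first = chars.next()?;
    // as_str() views the part of the string not yet consumed
    Some((first, chars.as_str()))
}

fn main() {
    println!("{:?}", split_first("test"));
    println!("{:?}", split_first("éa")); // the first char is two bytes long
    println!("{:?}", split_first(""));
}
```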
Get the first single character out of a string without using the rest of that string:
let text = "hello world!";
let ch = text.chars().take(1).last().unwrap();
It would be nice to have something similar to Haskell's head function and tail function for such cases.
I wrote this function to act like head and tail together (doesn't match exact implementation)
pub fn head_tail<T: Iterator, O: FromIterator<<T>::Item>>(iter: &mut T) -> (Option<<T>::Item>, O) {
(iter.next(), iter.collect::<O>())
}
Usage:
// works with Vec<i32>
let mut val = vec![1, 2, 3].into_iter();
println!("{:?}", head_tail::<_, Vec<i32>>(&mut val));
// works with chars in two ways
let mut val = "thanks! bedroom builds YT".chars();
println!("{:?}", head_tail::<_, String>(&mut val));
// calling the function with Vec<char>
let mut val = "thanks! bedroom builds YT".chars();
println!("{:?}", head_tail::<_, Vec<char>>(&mut val));
NOTE: The head_tail function doesn't panic! if the iterator is empty. If this matched Haskell's head/tail output, this would have thrown an exception if the iterator was empty. It might also be good to use iterable trait to be more compatible to other types.
If you only want to test for it, you can use starts_with():
"rust".starts_with('r')
"rust".starts_with(|c| c == 'r')
I think it is pretty straightforward:
let text = "hello world!";
let c: char = text.chars().next().unwrap();
next() takes the next item from the iterator
To “unwrap” something in Rust is to say, “Give me the result of the computation, and if there was an error, panic and stop the program.”
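The accepted answer is a bit ugly! If you know the first character is a single byte (e.g. ASCII), you can slice instead:
let text = "hello world!";
let ch = &text[0..1]; // this returns "h" as a &str, not a char; it panics if index 1 is not a char boundary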
