I want to read a text file and convert all lines into int values.
I use this code.
What I really miss here is a "good" way of handling errors.
use std::{
    fs::File,
    io::{prelude::*, BufReader},
    path::Path
};
fn lines_from_file(filename: impl AsRef<Path>) -> Vec<i32> {
    let file = File::open(filename).expect("no such file");
    let buf = BufReader::new(file);
    buf.lines()
        .map(|l| l.expect("Could not parse line"))
        .map(|l: String| l.parse::<i32>().expect("could not parse int"))
        .collect()
}
Question: how do I do proper error handling?
Is the example above "good Rust code"?
Or should I use something like this:
fn lines_from_file(filename: impl AsRef<Path>) -> Vec<i32> {
    let file = File::open(filename).expect("no such file");
    let buf = BufReader::new(file);
    buf.lines()
        .map(|l| l.expect("Could not parse line"))
        .map(|l: String| match l.parse::<i32>() {
            Ok(num) => num,
            Err(e) => -1 // Do something here
        }).collect()
}
You can actually collect into a Result<T, E>.
See docs
So you could collect into a Result<Vec<i32>, MyCustomErrorType>.
This works when you transform your iterator in an iterator which returns a Result<i32, MyCustomErrorType>. The iteration stops at the first Err you map.
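To see the short-circuiting in isolation, here is a minimal std-only sketch. The names parse_all and visited are made up for this illustration, and the inspect call is only there to count how many items were pulled from the iterator before collect stopped.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parses every item and also reports how many items
// the iterator actually visited before `collect` finished.
fn parse_all(items: &[&str]) -> (Result<Vec<i32>, ParseIntError>, usize) {
    let mut visited = 0;
    let parsed = items
        .iter()
        .inspect(|_| visited += 1) // count items as they are pulled
        .map(|s| s.parse::<i32>()) // each item becomes a Result<i32, ParseIntError>
        .collect::<Result<Vec<i32>, ParseIntError>>();
    (parsed, visited)
}
```

Called with ["1", "x", "3"], this should return an Err after visiting only two items, because the third one is never requested once the first Err has been seen.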
Here's your working code example.
I used the thiserror crate for error handling
use std::{
    fs::File,
    io::{prelude::*, BufReader},
    num::ParseIntError,
    path::Path,
};
use thiserror::Error;
#[derive(Error, Debug)]
pub enum LineParseError {
    #[error("Failed to read line")]
    IoError(#[from] std::io::Error),
    #[error("Failed to parse int")]
    FailedToParseInt(#[from] ParseIntError),
}
fn lines_from_file(filename: impl AsRef<Path>) -> Result<Vec<i32>, LineParseError> {
    let file = File::open(filename).expect("no such file");
    let buf = BufReader::new(file);
    buf.lines().map(|l| Ok(l?.parse()?)).collect()
}
A short explanation of how the code works, breaking down this line:
buf.lines().map(|l| Ok(l?.parse()?)).collect()
Rust infers that we need to collect into a Result<Vec<i32>, LineParseError> because that is the return type of the function.
Inside the closure passed to map we write l?. This makes the closure return an Err if l is an Err; the #[from] attribute on LineParseError::IoError takes care of the conversion.
The .parse()? works the same way: #[from] on LineParseError::FailedToParseInt takes care of the conversion.
Last but not least, the closure must return Ok(...) when the mapping succeeds; this is what makes collecting into a Result<Vec<i32>, LineParseError> possible.
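If you would rather not pull in thiserror, roughly the same behaviour can be written by hand. The following is only a sketch: the From impls are what the ? operator relies on for the conversion, ints_from_reader is a hypothetical variant of lines_from_file that accepts any BufRead so it can be tried on in-memory data, and the closure return type is spelled out to keep the inference obvious.

```rust
use std::fmt;
use std::io::{self, BufRead};
use std::num::ParseIntError;

// Hand-written stand-in for the thiserror-derived enum.
#[derive(Debug)]
pub enum LineParseError {
    IoError(io::Error),
    FailedToParseInt(ParseIntError),
}

impl fmt::Display for LineParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LineParseError::IoError(_) => write!(f, "Failed to read line"),
            LineParseError::FailedToParseInt(_) => write!(f, "Failed to parse int"),
        }
    }
}

impl std::error::Error for LineParseError {}

// These two impls let `?` convert the underlying errors into LineParseError.
impl From<io::Error> for LineParseError {
    fn from(e: io::Error) -> Self {
        LineParseError::IoError(e)
    }
}

impl From<ParseIntError> for LineParseError {
    fn from(e: ParseIntError) -> Self {
        LineParseError::FailedToParseInt(e)
    }
}

// Hypothetical helper: reads lines from any BufRead and parses each as i32.
pub fn ints_from_reader(reader: impl BufRead) -> Result<Vec<i32>, LineParseError> {
    reader
        .lines()
        .map(|line| -> Result<i32, LineParseError> { Ok(line?.parse::<i32>()?) })
        .collect()
}
```

Since &[u8] implements BufRead, something like ints_from_reader("1\n2\n".as_bytes()) is enough to try it without touching the file system.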
How could I pack the following code into a single iterator?
use std::io::{BufRead, BufReader};
use std::fs::File;
let file = BufReader::new(File::open("sample.txt").expect("Unable to open file"));
for line in file.lines() {
    for ch in line.expect("Unable to read line").chars() {
        println!("Character: {}", ch);
    }
}
Naively, I’d like to have something like (I skipped unwraps)
let lines = file.lines().next();
Reader {
    line: lines,
    char: next().chars()
}
and iterate over Reader.char till hitting None, then refreshing Reader.line to a new line and Reader.char to the first character of the line. This doesn't seem to be possible though because Reader.char depends on the temporary variable.
Please notice that the question is about nested iterators, reading text files is used as an example.
You can use the flat_map() iterator utility to create a new iterator that can produce any number of items for each item in the iterator it's called on.
In this case, that's complicated by the fact that lines() returns an iterator of Results, so the Err case must be handled.
There's also the issue that .chars() references the original string to avoid an additional allocation, so you have to collect the characters into another iterable container.
Solving both issues results in this mess:
fn example() -> impl Iterator<Item=Result<char, std::io::Error>> {
    let file = BufReader::new(File::open("sample.txt").expect("Unable to open file"));
    file.lines().flat_map(|line| match line {
        Err(e) => vec![Err(e)],
        Ok(line) => line.chars().map(Ok).collect(),
    })
}
If String gave us an into_chars() method we could avoid collect() here, but then we'd have differently-typed iterators and would need to use either Box<dyn Iterator> or something like either::Either.
Since you already use .expect() here, you can simplify a bit by using .expect() within the closure to avoid handling the Err case:
fn example() -> impl Iterator<Item=char> {
    let file = BufReader::new(File::open("sample.txt").expect("Unable to open file"));
    file.lines().flat_map(|line|
        line.expect("Unable to read line").chars().collect::<Vec<_>>()
    )
}
In the general case, flat_map() is usually quite easy. You just need to be mindful of whether you are iterating owned vs borrowed values; both cases have some sharp corners. In this case, iterating over owned String values makes using .chars() problematic. If we could iterate over borrowed str slices we wouldn't have to .collect().
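For comparison, here is a small sketch of the borrowed case. It assumes the lines have already been read into a slice of String values (the function name chars_of is made up for this example); because the strings outlive the iterator, .chars() can borrow from them directly and no intermediate Vec is needed.

```rust
// Hypothetical helper: flattens borrowed lines into a single stream of chars.
// The returned iterator borrows from `lines`, hence the explicit lifetime.
fn chars_of<'a>(lines: &'a [String]) -> impl Iterator<Item = char> + 'a {
    lines.iter().flat_map(|line| line.chars())
}
```

The trade-off is that the whole input must already be in memory, which is exactly what the owned, line-by-line version avoids.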
Drawing on the answer from @cdhowie and this answer that suggests using IntoIter to get an iterator of owned chars, I was able to come up with this solution that is the closest to what I expected:
use std::fs::File;
use std::io;
use std::io::{BufRead, BufReader, Lines};
use std::vec::IntoIter;
struct Reader {
    lines: Lines<BufReader<File>>,
    iter: IntoIter<char>,
}
impl Reader {
    fn new(filename: &str) -> Self {
        let file = BufReader::new(File::open(filename).expect("Unable to open file"));
        let mut lines = file.lines();
        let iter = Reader::char_iter(lines.next().expect("Unable to read file"));
        Reader { lines, iter }
    }
    fn char_iter(line: io::Result<String>) -> IntoIter<char> {
        line.unwrap().chars().collect::<Vec<_>>().into_iter()
    }
}
impl Iterator for Reader {
    type Item = char;
    fn next(&mut self) -> Option<Self::Item> {
        match self.iter.next() {
            None => {
                self.iter = match self.lines.next() {
                    None => return None,
                    Some(line) => Reader::char_iter(line),
                };
                Some('\n')
            }
            Some(val) => Some(val),
        }
    }
}
It works as expected:
let reader = Reader::new("src/main.rs");
for ch in reader {
    print!("{}", ch);
}
I'm trying to read a file into a vector, then print out a random line from that vector.
What am I doing wrong?
I'm asking here because I know I'm making a big conceptual mistake, but I'm having trouble identifying exactly where it is.
I know the error -
error[E0308]: mismatched types
26 | processor(&lines)
| ^^^^^^ expected &str, found struct std::string::String
And I see that there's a mismatch - but I don't know how to give the right type, or refactor the code for that (very short) function.
My code is below:
use std::{
    fs::File,
    io::{prelude::*, BufReader},
    path::Path,
};
fn lines_from_file(filename: impl AsRef<Path>) -> Vec<String> {
    let file = File::open(filename).expect("no such file");
    let buf = BufReader::new(file);
    buf.lines()
        .map(|l| l.expect("Could not parse line"))
        .collect()
}
fn processor(vectr: &Vec<&str>) -> () {
    let vec = vectr;
    let index = (rand::random::<f32>() * vec.len() as f32).floor() as usize;
    println!("{}", vectr[index]);
}
fn main() {
    let lines = lines_from_file("./example.txt");
    for line in lines {
        println!("{:?}", line);
    }
    processor(&lines);
}
When you call processor, you pass a Vec<String>, which is what lines_from_file returns, but processor expects a &Vec<&str>. You can change processor to accept what you actually have:
fn processor(vectr: &Vec<String>) -> () {
    let vec = vectr;
    let index = (rand::random::<f32>() * vec.len() as f32).floor() as usize;
    println!("{}", vectr[index]);
}
The main function:
fn main() {
    let lines = lines_from_file("./example.txt");
    for line in &lines { // &lines to avoid moving the variable
        println!("{:?}", line);
    }
    processor(&lines);
}
More generally, a String is not the same as a string slice &str, so Vec<String> is not the same as Vec<&str>. I'd recommend checking the Rust book: https://doc.rust-lang.org/nightly/book/ch04-03-slices.html?highlight=String#string-slices
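As an illustration of the difference, here is a std-only sketch with two made-up helpers. pick_line takes a slice of owned strings and hands back a borrowed &str (the index parameter stands in for the random index used in the question), and as_slices shows how to build a Vec<&str> view when a function really does require one.

```rust
// Hypothetical helper: accepts &[String], which a &Vec<String> coerces to,
// and returns a borrowed string slice if the index is in range.
fn pick_line(lines: &[String], index: usize) -> Option<&str> {
    lines.get(index).map(String::as_str)
}

// Hypothetical helper: builds a Vec<&str> that borrows from the owned Strings.
fn as_slices(lines: &[String]) -> Vec<&str> {
    lines.iter().map(String::as_str).collect()
}
```

Taking &[String] rather than &Vec<String> is a common choice because it also accepts arrays and sub-slices, while still working unchanged with processor(&lines)-style calls.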
I'm trying to build a feature which requires reading the contents of a file into a futures::stream::BoxStream, but I'm having a tough time figuring out what I need to do.
I have figured out how to read a file byte by byte via Bytes which implements an iterator.
use std::fs::File;
use std::io::prelude::*;
use std::io::{BufReader, Bytes};
// TODO: Convert this to a async Stream
fn async_read() -> Box<dyn Iterator<Item = Result<u8, std::io::Error>>> {
    let f = File::open("/dev/random").expect("Could not open file");
    let reader = BufReader::new(f);
    let iter = reader.bytes().into_iter();
    Box::new(iter)
}
fn main() {
    ctrlc::set_handler(move || {
        println!("received Ctrl+C!");
        std::process::exit(0);
    })
    .expect("Error setting Ctrl-C handler");
    for b in async_read().into_iter() {
        println!("{:?}", b);
    }
}
However, I've been struggling a bunch trying to figure out how I can turn this Box<dyn Iterator<Item = Result<u8, std::io::Error>>> into a Stream.
I would have thought something like this would work:
use futures::stream;
use std::fs::File;
use std::io::prelude::*;
use std::io::{BufReader, Bytes};
// TODO: Convert this to a async Stream
fn async_read() -> stream::BoxStream<'static, dyn Iterator<Item = Result<u8, std::io::Error>>> {
    let f = File::open("/dev/random").expect("Could not open file");
    let reader = BufReader::new(f);
    let iter = reader.bytes().into_iter();
    std::pin::Pin::new(Box::new(stream::iter(iter)))
}
fn main() {
    ctrlc::set_handler(move || {
        println!("received Ctrl+C!");
        std::process::exit(0);
    })
    .expect("Error setting Ctrl-C handler");
    while let Some(b) = async_read().poll() {
        println!("{:?}", b);
    }
}
But I keep getting a ton of compiler errors. I've tried other permutations, but I'm generally getting nowhere.
One of the compiler errors (at std::pin::Pin::new):
  --> src/main.rs:14:24
   |
14 |     std::pin::Pin::new(Box::new(stream::iter(iter)))
   |                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected trait object `dyn std::iter::Iterator`, found enum `std::result::Result`
Anyone have any advice?
I'm pretty new to Rust, and specifically Streams/lower level stuff so I apologize if I got anything wrong, feel free to correct me.
For some additional background, I'm trying to do this so you can CTRL-C out of a command in nushell
I think you are overcomplicating it a bit: you can just return impl Stream from async_read; there is no need to box or pin (the same goes for the original Iterator-based version). You then need an async executor to poll the stream (in this example I just use the one provided by futures::executor::block_on). With that in place, you can call futures::stream::StreamExt::next() on the stream to get a future representing the next item.
Here is one way to do this:
use futures::prelude::*;
use std::{
    fs::File,
    io::{prelude::*, BufReader},
};
fn async_read() -> impl Stream<Item = Result<u8, std::io::Error>> {
    let f = File::open("/dev/random").expect("Could not open file");
    let reader = BufReader::new(f);
    stream::iter(reader.bytes())
}
async fn async_main() {
    // Create the stream once; calling async_read() in the loop condition
    // would reopen the file and build a new stream on every iteration.
    let mut bytes = async_read();
    while let Some(b) = bytes.next().await {
        println!("{:?}", b);
    }
}
fn main() {
    ctrlc::set_handler(move || {
        println!("received Ctrl+C!");
        std::process::exit(0);
    })
    .expect("Error setting Ctrl-C handler");
    futures::executor::block_on(async_main());
}
This is what I have, but I want to avoid using unwrap on my reqwest values:
extern crate base64;
extern crate reqwest;
use serde_json;
use serde_json::json;
pub fn perform_get(id: String) -> serde_json::value::Value {
    let client = reqwest::Client::builder().build().unwrap();
    let url = String::from("SomeURL");
    let res = client.get(&url).send().unwrap().text();
    let mut v = json!(null);
    match res {
        Ok(n) => {
            v = serde_json::from_str(&n).unwrap();
        }
        Err(r) => {
            println!("Something wrong happened {:?}", r);
        }
    }
    v
}
fn main() {
    println!("Hi there! i want the function above to return a result instead of a Serde value so I can handle the error in main!");
}
Here is a link to a rust playground example
The official Rust book, The Rust Programming Language, is freely available online. It has an entire chapter on using Result, explaining introductory topics such as the Result enum and how to use it.
How to return a Result containing a serde_json::Value?
The same way you return a Result of any type; there's nothing special about Value:
use serde_json::json; // 1.0.38
pub fn ok_example() -> Result<serde_json::value::Value, i32> {
    Ok(json! { "success" })
}
pub fn err_example() -> Result<serde_json::value::Value, i32> {
    Err(42)
}
If you have a function that returns a Result, you can use the question mark operator (?) to exit early from a function on error, returning the error. This is a concise way to avoid unwrap or expect:
fn use_them() -> Result<(), i32> {
    let ok = ok_example()?;
    println!("{:?}", ok);
    let err = err_example()?;
    println!("{:?}", err); // Never executed, we always exit due to the `?`
    Ok(()) // Never executed
}
This is just a basic example.
Applied to your MCVE, it would look something like:
use reqwest; // 0.9.10
use serde_json::Value; // 1.0.38
type Error = Box<dyn std::error::Error>;
pub fn perform_get(_id: String) -> Result<Value, Error> {
    let client = reqwest::Client::builder().build()?;
    let url = String::from("SomeURL");
    let res = client.get(&url).send()?.text()?;
    let v = serde_json::from_str(&res)?;
    Ok(v)
}
Here, I'm using the trait object Box<dyn std::error::Error> to handle any kind of error (great for quick programs and examples). I then sprinkle ? on every method that could fail (i.e. returns a Result) and end the function with an explicit Ok for the final value.
Note that the panic and the never-used null value can be removed with this style.
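The same pattern can be tried without reqwest or serde_json. The following is a minimal std-only sketch (parse_bytes is a made-up name) in which two unrelated error types, Utf8Error and ParseIntError, are both converted into Box<dyn Error> by ?:

```rust
use std::error::Error;

// Hypothetical helper: decodes bytes as UTF-8 and parses them as an i32.
// Each `?` converts its concrete error type into Box<dyn Error>.
fn parse_bytes(bytes: &[u8]) -> Result<i32, Box<dyn Error>> {
    let text = std::str::from_utf8(bytes)?; // may fail with Utf8Error
    let number = text.trim().parse::<i32>()?; // may fail with ParseIntError
    Ok(number)
}
```

If the caller later needs to tell the two failures apart, the boxed error can be inspected with downcast_ref, which is the price of erasing the concrete type.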
See also:
What is this question mark operator about?
Rust proper error handling (auto convert from one error type to another with question mark)
Rust return result error from fn
Return value from match to Err(e)
What is the idiomatic way to handle/unwrap nested Result types?
better practice to return a Result
See also:
Should I avoid unwrap in production application?
If you are on the user side, I would suggest using Box<dyn std::error::Error>. It lets you return any type that implements Error, and ? converts the concrete error type into the boxed trait object. This adds a little overhead when there is an error, but when errors are unexpected or really rare this is not a big deal.
use reqwest;
use serde_json::value::Value;
use std::error::Error;
fn perform_get(_id: String) -> Result<Value, Box<dyn Error>> {
    let client = reqwest::Client::builder().build()?;
    let url = String::from("SomeURL");
    let res = client.get(&url).send()?.text()?;
    let v = serde_json::from_str(&res)?;
    Ok(v)
    // the last two lines could be serde_json::from_str(&res).map_err(std::convert::Into::into)
}
fn main() {
    println!("{:?}", perform_get("hello".to_string()));
}
This produces the following error:
Err(Error { kind: Url(RelativeUrlWithoutBase), url: None })
The kind smart folks over at Rust Discord helped me solve this one. (user noc)
extern crate base64;
extern crate reqwest;
pub fn get_jira_ticket() -> Result<serde_json::value::Value, reqwest::Error> {
    let client = reqwest::Client::builder().build().unwrap();
    let url = String::from("SomeURL");
    let res = client.get(&url).send().and_then(|mut r| r.json());
    res
}
fn main() {
    println!("This works");
}
The key part was this in the header for the return
-> Result<serde_json::value::Value, reqwest::Error>
And this here to actually return the data.
client.get(&url).send().and_then(|mut r| r.json());
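For readers unfamiliar with Result::and_then, here is a tiny std-only sketch of the same chaining idea (parse_and_add is a made-up function, unrelated to reqwest). The closure only runs if the first step succeeded, and it must return a Result with the same error type:

```rust
use std::num::ParseIntError;

// Hypothetical helper: parses two strings and adds them.
// If `a` fails to parse, the closure is skipped and that error is returned.
fn parse_and_add(a: &str, b: &str) -> Result<i32, ParseIntError> {
    a.parse::<i32>()
        .and_then(|x| b.parse::<i32>().map(|y| x + y))
}
```

This only works directly when both steps share an error type, which is why the reqwest version above can return reqwest::Error for both send() and json().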
I have this code:
use std::fs::File;
use std::io::{BufRead, BufReader};
fn load_file() -> Vec<String> {
    let file = BufReader::new(File::open("foo.txt").unwrap());
    file.lines().map(|x| x.unwrap()).collect();
}
fn main() {
    let data = load_file();
    println!("DATA: {}", data[0]);
}
When I try to compile it, I get this error:
error[E0283]: type annotations required: cannot resolve `_: std::iter::FromIterator<std::string::String>`
 --> src/main.rs:6:38
  |
6 |     file.lines().map(|x| x.unwrap()).collect();
  |                                      ^^^^^^^
In fact, if I change the load_file function in this way, the code compiles smoothly:
fn load_file() -> Vec<String> {
    let file = BufReader::new(File::open("foo.txt").unwrap());
    let lines: Vec<String> = file.lines().map(|x| x.unwrap()).collect();
    return lines;
}
This solution is not "Rusty" enough because ending a function with a return is not encouraged.
Is there a way to put the type annotation directly into the file.lines().map(|x| x.unwrap()).collect(); statement?
Iterator::collect's signature looks like this:
fn collect<B>(self) -> B
where
    B: FromIterator<Self::Item>,
In your case, you need to tell it what B is. To specify the types of a generic function, you use syntax called the turbofish, which looks like func::<T, U, ...>()
Your load_file function should look like this:
fn load_file() -> Vec<String> {
    let file = BufReader::new(File::open("foo.txt").unwrap());
    file.lines().map(|x| x.unwrap()).collect::<Vec<String>>()
}
You can also allow some type inference to continue by specifying some types as the placeholder _:
fn load_file() -> Vec<String> {
    let file = BufReader::new(File::open("foo.txt").unwrap());
    file.lines().map(|x| x.unwrap()).collect::<Vec<_>>()
}
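The turbofish is what selects the target container, so the same kind of iterator can be collected into different types. A small std-only sketch (collect_demo is a made-up name used only here):

```rust
use std::collections::HashSet;

// Hypothetical demo: collects word lengths into a Vec and a HashSet,
// and concatenates the words themselves into a String.
fn collect_demo(words: &[&str]) -> (Vec<usize>, HashSet<usize>, String) {
    let lengths = words.iter().map(|w| w.len()).collect::<Vec<usize>>();
    let unique = words.iter().map(|w| w.len()).collect::<HashSet<_>>();
    let joined = words.iter().copied().collect::<String>();
    (lengths, unique, joined)
}
```

Each call works because the named type implements FromIterator for the iterator's item type; the _ placeholder leaves the element type to inference.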
In fact your problem was slightly less noticeable. This does not compile (your initial piece of code):
use std::fs::File;
use std::io::{BufRead, BufReader};
fn load_file() -> Vec<String> {
    let file = BufReader::new(File::open("foo.txt").unwrap());
    file.lines().map(|x| x.unwrap()).collect();
}
fn main() {
    let data = load_file();
    println!("DATA: {}", data[0]);
}
But this does:
use std::fs::File;
use std::io::{BufRead, BufReader};
fn load_file() -> Vec<String> {
    let file = BufReader::new(File::open("foo.txt").unwrap());
    file.lines().map(|x| x.unwrap()).collect()
}
fn main() {
    let data = load_file();
    println!("DATA: {}", data[0]);
}
Can you notice the subtle difference? It's just a semicolon in the last line of load_file().
Type inference in Rust is strong enough not to need an annotation here. Your problem was that you were ignoring the result of collect()! The semicolon acted like a "barrier" for the type inference, because with it collect()'s return type and load_file()'s return type are not connected. The error message is somewhat misleading, however; it seems that this phase of type checking ran earlier than the check for return types (which would rightly fail because () is not compatible with Vec<String>).
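The two working forms can be compared side by side in a file-free sketch (squares and squares_explicit are made-up names). In the first, the tail expression ties collect() to the function's return type; in the second, a let annotation does the same job:

```rust
// No trailing semicolon: the call is the function's value, so the
// return type Vec<u32> tells collect() what to build.
fn squares(n: u32) -> Vec<u32> {
    (1..=n).map(|x| x * x).collect()
}

// With a statement, the target type has to come from somewhere else,
// here from the annotation on the binding.
fn squares_explicit(n: u32) -> Vec<u32> {
    let result: Vec<u32> = (1..=n).map(|x| x * x).collect();
    result
}
```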