Read lines in Rust with match result

I'm pretty new to Rust and trying to implement some basic stuff.
One of the samples from the docs is about reading lines from a text file: https://doc.rust-lang.org/rust-by-example/std_misc/file/read_lines.html
The sample code works (obviously...), but when I modify it to add some error handling, the compiler complains about the result type I define:
// the result after -> is not valid...
pub fn read_lines<P>(filename: P) -> std::result::Result<std::io::Lines<std::io::BufReader<std::fs::File>>, std::io::Error>
where P: AsRef<Path>, {
    let file = File::open(filename);
    let file = match file {
        Ok(file) => io::BufReader::new(file).lines(),
        Err(error) => panic!("Problem opening the file: {:?}", error),
    };
}
// this is fine
pub fn read_lines2<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where P: AsRef<Path>, {
    let file = File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}
I've tried different suggestions from IntelliSense, but no luck.
Any idea how to define the result type when there is an Ok/Err?

If I understand the intent of your code correctly, here is a more canonical version:
use std::fs::File;
use std::io::{self, prelude::*};
use std::path::Path;

pub fn read_lines(filename: &Path) -> io::Lines<io::BufReader<File>> {
    let file = File::open(filename).expect("Problem opening the file");
    io::BufReader::new(file).lines()
}
If you want the caller to handle any errors, return something like io::Result<T> (an alias for std::result::Result<T, io::Error>) to indicate a possible failure. However, if you decide to handle the error inside the function, using something like panic!() or Result::expect(), then there is no need to return a Result to the caller, since the function only returns if no error occurred.
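For illustration, here is a minimal caller-side sketch, assuming the Result-returning read_lines2 from the question (the file name hosts.txt is made up):

use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;

pub fn read_lines2<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where P: AsRef<Path>, {
    let file = File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}

fn main() {
    // The caller decides what an open failure means.
    match read_lines2("hosts.txt") {
        Ok(lines) => {
            for line in lines {
                match line {
                    Ok(line) => println!("{}", line),
                    Err(e) => eprintln!("Failed to read a line: {}", e),
                }
            }
        }
        Err(e) => eprintln!("Problem opening the file: {:?}", e),
    }
}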

Is this what you were looking for?
pub fn read_lines<P>(filename: P) -> std::result::Result<std::io::Lines<std::io::BufReader<std::fs::File>>, std::io::Error>
where P: AsRef<Path>, {
    let file = File::open(filename);
    match file {
        Ok(file) => Ok(io::BufReader::new(file).lines()),
        Err(error) => panic!("Problem opening the file: {:?}", error),
    }
}
I removed the let file = binding and the trailing semicolon to enable an implicit return, and added an Ok() wrapper around the happy path.

Related

Syn read file and add line to function

I'm trying to parse a file with syn and add a line to the single function in it. However, it seems not to modify the file at all when writing it back out. I'm fairly sure I don't fully understand proc macros and am using them wrong.
In my Cargo.toml I define a lib and bin like so:
[lib]
name = "gen"
path = "src/gen.rs"
proc-macro = true

[[bin]]
name = "main"
path = "src/main.rs"
In my gen.rs file, I define a macro to take in the input, get the function and modify it like so:
use proc_macro::TokenStream;
use quote::quote;

#[proc_macro]
pub fn gen(input: TokenStream) -> TokenStream {
    let item = syn::parse(input.clone());
    match item {
        Ok(mut v) => {
            let fn_item = match &mut v {
                syn::Item::Fn(fn_item) => fn_item,
                _ => panic!("expected fn"),
            };
            fn_item.block.stmts.insert(
                0,
                syn::parse(quote!(println!("count me in");).into()).unwrap(),
            );
            use quote::ToTokens;
            return v.into_token_stream().into();
        }
        Err(error) => {
            println!("{:?}", error);
            return input;
        }
    };
}
Now in my main.rs file, I read the file, convert it to a TokenStream, and use my macro on it and write out the output to a file:
fn main() {
    if let Err(error) = try_main() {
        let _ = writeln!(io::stderr(), "{}", error);
        process::exit(1);
    }
}

fn try_main() -> Result<(), Error> {
    let mut args = env::args_os();
    let _ = args.next(); // executable name
    let filepath = PathBuf::from("./src/file-to-parse.rs");
    let code = fs::read_to_string(&filepath).map_err(Error::ReadFile)?;
    let syntax = syn::parse_file(&code).map_err({
        |error| Error::ParseFile {
            error,
            filepath,
            source_code: code,
        }
    })?;
    let mut token_stream = TokenStream::new();
    syntax.to_tokens(&mut token_stream);
    let file_contents_updated = gen::gen!(&token_stream);
    std::fs::write("./src/file-updated.rs", file_contents_updated.to_string());
    Ok(())
}
Running this, my output file looks the same as the input. For reference, my input file looks like:
fn init() {
    println!("Hello, world!");
}
Yes, you've misunderstood what proc macros do.
gen::gen!(&token_stream) invokes gen!() at compile time with the literal tokens & token_stream. Since that doesn't look much like a function, syn will fail to parse it, and your code will hit println!("{:?}", error); return input; (which, by the way, is a bad idea for a proc macro: a parsing failure should abort compilation, so use return error.into_compile_error().into() instead). So the macro returns its input unchanged, which is why the output file looks the same as the input.
You can use syn and quote for general-purpose code generation, but you should not use proc macros for that; use them as ordinary libraries instead. That is, call gen::gen(token_stream) rather than gen::gen!(&token_stream). You can then drop the proc_macro marking and keep the function in the same crate.
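For example, a minimal sketch of that library approach, assuming proc_macro2 and syn::parse2 (my adaptation of your existing gen code, not the only way to structure it):

// gen.rs as an ordinary library function instead of a proc macro.
use proc_macro2::TokenStream;
use quote::{quote, ToTokens};

pub fn gen(input: TokenStream) -> TokenStream {
    // Parse the tokens into an item; we expect a free function.
    let mut item: syn::Item = syn::parse2(input).expect("expected a parseable item");
    let fn_item = match &mut item {
        syn::Item::Fn(fn_item) => fn_item,
        _ => panic!("expected fn"),
    };
    // Insert a statement at the top of the function body.
    fn_item.block.stmts.insert(
        0,
        syn::parse2(quote!(println!("count me in");)).expect("valid statement"),
    );
    item.into_token_stream()
}

In main.rs you would then call gen::gen(token_stream) on the proc_macro2::TokenStream you already built and write the result out with to_string(), as before.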

How to do proper error handling inside a map function? [duplicate]

This question already has answers here:
How does the Iterator::collect function work?
(2 answers)
Closed 1 year ago.
I want to read a text file and convert all lines into int values.
I use this code, but what I really miss here is a "good" way of error handling.
use std::{
    fs::File,
    io::{prelude::*, BufReader},
    path::Path,
};

fn lines_from_file(filename: impl AsRef<Path>) -> Vec<i32> {
    let file = File::open(filename).expect("no such file");
    let buf = BufReader::new(file);
    buf.lines()
        .map(|l| l.expect("Could not parse line"))
        .map(|l: String| l.parse::<i32>().expect("could not parse int"))
        .collect()
}
Question: how do I do proper error handling? Is the above example "good Rust code"? Or should I use something like this:
fn lines_from_file(filename: impl AsRef<Path>) -> Vec<i32> {
    let file = File::open(filename).expect("no such file");
    let buf = BufReader::new(file);
    buf.lines()
        .map(|l| l.expect("Could not parse line"))
        .map(|l: String| match l.parse::<i32>() {
            Ok(num) => num,
            Err(e) => -1, // Do something here
        })
        .collect()
}
You can actually collect into a Result<T, E> (see the collect documentation). So you could collect into a Result<Vec<i32>, MyCustomErrorType>. This works when you transform your iterator into an iterator that yields Result<i32, MyCustomErrorType>; the iteration stops at the first Err produced by the mapping.
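A tiny standalone illustration of that behavior (no file I/O, just string parsing; the parse_all helper is made up for this example):

use std::num::ParseIntError;

// Collecting an iterator of Result<i32, _> into Result<Vec<i32>, _>:
// the first Err short-circuits the collection.
fn parse_all(items: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    items.iter().map(|s| s.parse::<i32>()).collect()
}

fn main() {
    assert_eq!(parse_all(&["1", "2", "3"]), Ok(vec![1, 2, 3]));
    assert!(parse_all(&["1", "oops", "3"]).is_err());
}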
Here's your working code example; I used the thiserror crate for error handling:
use std::{
    fs::File,
    io::{prelude::*, BufReader},
    num::ParseIntError,
    path::Path,
};
use thiserror::Error;

#[derive(Error, Debug)]
pub enum LineParseError {
    #[error("Failed to read line")]
    IoError(#[from] std::io::Error),
    #[error("Failed to parse int")]
    FailedToParseInt(#[from] ParseIntError),
}

fn lines_from_file(filename: impl AsRef<Path>) -> Result<Vec<i32>, LineParseError> {
    let file = File::open(filename).expect("no such file");
    let buf = BufReader::new(file);
    buf.lines().map(|l| Ok(l?.parse()?)).collect()
}
Some explanation of how the code works, breaking down this line:
buf.lines().map(|l| Ok(l?.parse()?)).collect()
Rust infers that we need to collect into a Result<Vec<i32>, LineParseError> because that is the return type of the function.
Inside the mapping closure, l? makes the closure return an Err if the l result contains an Err; the #[from] attribute on LineParseError::IoError takes care of the conversion.
The .parse()? works the same way: #[from] on LineParseError::FailedToParseInt takes care of the conversion.
Last but not least, the closure must return Ok(...) when the mapping succeeds; this is what makes collecting into a Result<Vec<i32>, LineParseError> possible.
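As a usage sketch building directly on the listing above (the file name numbers.txt is made up), the caller can now decide how to react to each error variant:

fn main() {
    match lines_from_file("numbers.txt") {
        Ok(numbers) => println!("Read {} numbers, sum = {}", numbers.len(), numbers.iter().sum::<i32>()),
        Err(LineParseError::IoError(e)) => eprintln!("I/O problem: {}", e),
        Err(LineParseError::FailedToParseInt(e)) => eprintln!("Bad number: {}", e),
    }
}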

Reading ZIP file in Rust causes data owned by the current function

I'm new to Rust and likely have a huge knowledge gap. Basically, I'm hoping to create a utility function that accepts either a regular text file or a ZIP file and returns a BufRead the caller can use to process the contents line by line. It works well for non-ZIP files, but I don't understand how to achieve the same for ZIP files. The ZIP files will only contain a single file within the archive, which is why I'm only processing the first file in the ZipArchive.
I'm running into the following error.
error[E0515]: cannot return value referencing local variable `archive_contents`
  --> src/file_reader.rs:30:9
   |
27 | let archive_file: zip::read::ZipFile = archive_contents.by_index(0).unwrap();
   |                                         ---------------- `archive_contents` is borrowed here
...
30 | Ok(Box::new(BufReader::with_capacity(128 * 1024, archive_file)))
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ returns a value referencing data owned by the current function
It seems archive_contents is preventing the BufRead object from being returned to the caller. I'm just not sure how to work around this.
file_reader.rs
use std::ffi::OsStr;
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::path::Path;

pub struct FileReader {
    pub file_reader: Result<Box<BufRead>, &'static str>,
}

pub fn file_reader(filename: &str) -> Result<Box<BufRead>, &'static str> {
    let path = Path::new(filename);
    let file = match File::open(&path) {
        Ok(file) => file,
        Err(why) => panic!(
            "ERROR: Could not open file, {}: {}",
            path.display(),
            why.to_string()
        ),
    };
    if path.extension() == Some(OsStr::new("zip")) {
        // Processing ZIP file.
        let mut archive_contents: zip::read::ZipArchive<std::fs::File> =
            zip::ZipArchive::new(file).unwrap();
        let archive_file: zip::read::ZipFile = archive_contents.by_index(0).unwrap();
        // ERRORS: returns a value referencing data owned by the current function
        Ok(Box::new(BufReader::with_capacity(128 * 1024, archive_file)))
    } else {
        // Processing non-ZIP file.
        Ok(Box::new(BufReader::with_capacity(128 * 1024, file)))
    }
}
main.rs
mod file_reader;

use std::io::BufRead;

fn main() {
    let mut files: Vec<String> = Vec::new();
    files.push("/tmp/text_file.txt".to_string());
    files.push("/tmp/zip_file.zip".to_string());

    for f in files {
        let mut fr = match file_reader::file_reader(&f) {
            Ok(fr) => fr,
            Err(e) => panic!("Error reading file."),
        };
        fr.lines().for_each(|l| match l {
            Ok(l) => {
                println!("{}", l);
            }
            Err(e) => {
                println!("ERROR: Failed to read line:\n {}", e);
            }
        });
    }
}
Any help is greatly appreciated!
It seems the archive_contents is preventing the BufRead object from returning to the caller. I'm just not sure how to work around this.
You have to restructure the code somehow. The issue here is that, well, the archive data is part of the archive. So unlike file, archive_file is not an independent item; it is rather a pointer of sorts into the archive itself, which means the archive needs to live longer than archive_file for this code to be correct.
In a GC'd language this isn't an issue: archive_file holds a reference to the archive and will keep it alive for as long as it needs. Not so for Rust.
A simple way to fix this would be to copy the data out of archive_file into an owned buffer you can return to the parent. Another option might be to return a wrapper around (archive_contents, item_index) which delegates the reading (that might be somewhat tricky, though). Yet another would be to not have file_reader at all.
Thanks to #Masklinn for the direction! Here's the working solution using their suggestion.
file_reader.rs
use std::ffi::OsStr;
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::io::Cursor;
use std::io::Error;
use std::io::Read;
use std::path::Path;
use zip::read::ZipArchive;

pub fn file_reader(filename: &str) -> Result<Box<dyn BufRead>, Error> {
    let path = Path::new(filename);
    let file = match File::open(&path) {
        Ok(file) => file,
        Err(why) => return Err(why),
    };
    if path.extension() == Some(OsStr::new("zip")) {
        let mut archive_contents = ZipArchive::new(file)?;
        let mut archive_file = archive_contents.by_index(0)?;

        // Read the contents of the file into a vec.
        let mut data = Vec::new();
        archive_file.read_to_end(&mut data)?;

        // Wrap the vec in a std::io::Cursor.
        let cursor = Cursor::new(data);
        Ok(Box::new(cursor))
    } else {
        // Processing non-ZIP file.
        Ok(Box::new(BufReader::with_capacity(128 * 1024, file)))
    }
}
While the solution you have settled on does work, it has a few disadvantages. One is that when you read from a zip file, you have to read the contents of the file you want to process into memory before proceeding, which might be impractical for a large file. Another is that you have to heap allocate the BufReader in either case.
Another, possibly more idiomatic, solution is to restructure your code so that the BufReader does not need to be returned from the function at all; rather, have a function that opens the file and in turn calls a function that processes it:
use std::ffi::OsStr;
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::path::Path;

pub fn process_file(filename: &str) -> Result<usize, String> {
    let path = Path::new(filename);
    let file = match File::open(&path) {
        Ok(file) => file,
        Err(why) => return Err(format!(
            "ERROR: Could not open file, {}: {}",
            path.display(),
            why.to_string()
        )),
    };
    if path.extension() == Some(OsStr::new("zip")) {
        // Handling a zip file
        let mut archive_contents = zip::ZipArchive::new(file).unwrap();
        let mut buf_reader =
            BufReader::with_capacity(128 * 1024, archive_contents.by_index(0).unwrap());
        process_reader(&mut buf_reader)
    } else {
        // Handling a plain file.
        process_reader(&mut BufReader::with_capacity(128 * 1024, file))
    }
}

pub fn process_reader(reader: &mut dyn BufRead) -> Result<usize, String> {
    // Example, just count the number of lines
    return Ok(reader.lines().count());
}

fn main() {
    let mut files: Vec<String> = Vec::new();
    files.push("/tmp/text_file.txt".to_string());
    files.push("/tmp/zip_file.zip".to_string());

    for f in files {
        match process_file(&f) {
            Ok(count) => println!("File {} Count: {}", &f, count),
            Err(e) => println!("Error reading file: {}", e),
        };
    }
}
This way, you don't need any Boxes, and you don't need to read the file into memory before processing it.
A drawback to this solution would be if you had multiple functions that need to be able to read from zip files. One way to handle that would be to define process_file to take a callback function that does the processing. First, you would change the definition of process_file to:
pub fn process_file<C>(filename: &str, process_reader: C) -> Result<usize, String>
where
    C: FnOnce(&mut dyn BufRead) -> Result<usize, String>,
The rest of the function body can be left unchanged. Now, process_reader can be passed into the function, like this:
process_file(&f, count_lines)
where count_lines would be the original simple function to count the lines, for instance.
This would also allow you to pass in a closure:
process_file(&f, |reader| Ok(reader.lines().count()))
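Putting it together, a complete sketch of the callback variant could look like this (the error messages and the 128 * 1024 buffer size are just carried over from the earlier listing):

use std::ffi::OsStr;
use std::fs::File;
use std::io::{BufRead, BufReader};
use std::path::Path;

pub fn process_file<C>(filename: &str, process_reader: C) -> Result<usize, String>
where
    C: FnOnce(&mut dyn BufRead) -> Result<usize, String>,
{
    let path = Path::new(filename);
    let file = File::open(path)
        .map_err(|why| format!("ERROR: Could not open file, {}: {}", path.display(), why))?;
    if path.extension() == Some(OsStr::new("zip")) {
        // The ZipFile borrows from archive_contents, but both live only inside
        // this function, so the borrow checker is satisfied.
        let mut archive_contents = zip::ZipArchive::new(file).map_err(|e| e.to_string())?;
        let archive_file = archive_contents.by_index(0).map_err(|e| e.to_string())?;
        process_reader(&mut BufReader::with_capacity(128 * 1024, archive_file))
    } else {
        process_reader(&mut BufReader::with_capacity(128 * 1024, file))
    }
}

fn count_lines(reader: &mut dyn BufRead) -> Result<usize, String> {
    Ok(reader.lines().count())
}

fn main() {
    // Pass a named function...
    match process_file("/tmp/text_file.txt", count_lines) {
        Ok(count) => println!("Count: {}", count),
        Err(e) => println!("Error reading file: {}", e),
    }
    // ...or a closure.
    match process_file("/tmp/zip_file.zip", |reader| Ok(reader.lines().count())) {
        Ok(count) => println!("Count: {}", count),
        Err(e) => println!("Error reading file: {}", e),
    }
}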

How to put a type annotation in an iterator's collect statement?

I have this code:
use std::fs::File;
use std::io::{BufRead, BufReader};

fn load_file() -> Vec<String> {
    let file = BufReader::new(File::open("foo.txt").unwrap());
    file.lines().map(|x| x.unwrap()).collect();
}

fn main() {
    let data = load_file();
    println!("DATA: {}", data[0]);
}
fn main() {
let data = load_file();
println!("DATA: {}", data[0]);
}
When I try to compile it, I get this error:
error[E0283]: type annotations required: cannot resolve `_: std::iter::FromIterator<std::string::String>`
 --> src/main.rs:6:38
  |
6 | file.lines().map(|x| x.unwrap()).collect();
  |                                  ^^^^^^^
In fact, if I change the load_file function in this way, the code compiles smoothly:
fn load_file() -> Vec<String> {
    let file = BufReader::new(File::open("foo.txt").unwrap());
    let lines: Vec<String> = file.lines().map(|x| x.unwrap()).collect();
    return lines;
}
This solution is not "Rusty" enough because ending a function with a return is not encouraged.
Is there a way to put the type annotation directly into the file.lines().map(|x| x.unwrap()).collect(); statement?
Iterator::collect's signature looks like this:
fn collect<B>(self) -> B
where
    B: FromIterator<Self::Item>,
In your case, you need to tell it what B is. To specify the types of a generic function, you use syntax called the turbofish, which looks like func::<T, U, ...>()
Your load_file function should look like this:
fn load_file() -> Vec<String> {
    let file = BufReader::new(File::open("foo.txt").unwrap());
    file.lines().map(|x| x.unwrap()).collect::<Vec<String>>()
}
You can also allow some type inference to continue by specifying some types as the placeholder _:
fn load_file() -> Vec<String> {
    let file = BufReader::new(File::open("foo.txt").unwrap());
    file.lines().map(|x| x.unwrap()).collect::<Vec<_>>()
}
In fact, your problem was slightly more subtle. This does not compile (your initial piece of code):
use std::fs::File;
use std::io::{BufRead, BufReader};

fn load_file() -> Vec<String> {
    let file = BufReader::new(File::open("foo.txt").unwrap());
    file.lines().map(|x| x.unwrap()).collect();
}

fn main() {
    let data = load_file();
    println!("DATA: {}", data[0]);
}
But this does:
use std::fs::File;
use std::io::{BufRead, BufReader};

fn load_file() -> Vec<String> {
    let file = BufReader::new(File::open("foo.txt").unwrap());
    file.lines().map(|x| x.unwrap()).collect()
}

fn main() {
    let data = load_file();
    println!("DATA: {}", data[0]);
}
Can you spot the subtle difference? It's just the semicolon on the last line of load_file().
Type inference in Rust is strong enough that no annotation is needed here. Your problem was that you were ignoring the result of collect()! The semicolon acted like a "barrier" for type inference: with it, collect()'s return type and load_file()'s return type are not connected. The error message is somewhat misleading, however; it seems that this phase of type checking runs earlier than the check on the return type (which would rightly fail, because () is not compatible with Vec<String>).

Reading Bytes From a Reader

I'm writing something to process stdin in blocks of bytes, but can't seem to work out a simple way to do it (though I suspect there is one).
fn run() -> int {
    // Doesn't compile: types differ
    let mut buffer = [0, ..100];
    loop {
        let block = match stdio::stdin().read(buffer) {
            Ok(bytes_read) => buffer.slice_to(bytes_read),
            // This captures the Err from the end of the file,
            // but also actual errors while reading from stdin.
            Err(message) => return 0
        };
        process(block).unwrap();
    }
}

fn process(block: &[u8]) -> Result<(), IoError> {
    // do things
}
My questions:
What's the "standard" way to do this? (I've been trying/hoping to use and_then()/or_else())
How can I differentiate between the Err(IoError) from end of the file, and the Err that's actually an error?
The previously accepted answer is outdated (it predates Rust 1.0). EOF is no longer considered an error; you can do it like this:
use std::io::{self, Read};

fn main() {
    let mut buffer = [0; 100];
    while let Ok(bytes_read) = io::stdin().read(&mut buffer) {
        if bytes_read == 0 {
            break;
        }
        process(&buffer[..bytes_read]).unwrap();
    }
}

fn process(block: &[u8]) -> Result<(), io::Error> {
    Ok(()) // do things
}
Note that this may not result in the behavior you expect: read doesn't have to fill the buffer and may return with any number of bytes read. In the case of stdin, the read implementation returns every time a newline is detected (i.e., when you press Enter in the terminal).
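If you really do want fixed-size blocks, a minimal sketch (essentially what Read::read_exact does, minus the error on a short final block) is to keep calling read until the buffer is full or EOF is hit:

use std::io::{self, Read};

// Fill `buf` as far as possible; returns how many bytes were actually read.
// A return value smaller than buf.len() means EOF was reached.
fn read_block<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..])? {
            0 => break, // EOF
            n => filled += n,
        }
    }
    Ok(filled)
}

fn main() -> io::Result<()> {
    let mut buffer = [0u8; 100];
    let stdin = io::stdin();
    let mut handle = stdin.lock();
    loop {
        let n = read_block(&mut handle, &mut buffer)?;
        if n == 0 {
            break;
        }
        // process(&buffer[..n]) would go here.
        println!("got a block of {} bytes", n);
        if n < buffer.len() {
            break; // short block means EOF
        }
    }
    Ok(())
}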
Rust API documentation states that:
Note that end-of-file is considered an error, and can be inspected for
in the error's kind field.
The IoError struct looks like this:
pub struct IoError {
    pub kind: IoErrorKind,
    pub desc: &'static str,
    pub detail: Option<String>,
}
The full list of kinds is at http://doc.rust-lang.org/std/io/enum.IoErrorKind.html
You can match it like this:
match stdio::stdin().read(buffer) {
    Ok(_) => println!("ok"),
    Err(io::IoError { kind: io::EndOfFile, .. }) => println!("end of file"),
    _ => println!("error")
}
