I am trying to learn Rust. I am following an online book that implements the Unix program cat. Right now I am trying to read the contents of files passed as arguments, as in cargo run file1.txt file2.txt, but the program panics:
D:\rust\cat> cargo run .\src\test.txt
Compiling cat v0.1.0 (D:\rust\cat)
Finished dev [unoptimized + debuginfo] target(s) in 0.62s
Running `target\debug\cat.exe .\src\test.txt`
thread 'main' panicked at 'Box<Any>', src\main.rs:12:28
This is my program:
use std::env;
use std::fs::File;
use std::io;
use std::io::prelude::*;
fn main() {
let args: Vec<String> = env::args().collect();
if args.len() > 1 {
match read_file(&args) {
Ok(content) => println!("{}", content),
Err(reason) => panic!(reason),
}
}
}
fn read_file(filenames: &Vec<String>) -> Result<String, io::Error> {
let mut content = String::new();
for filename in filenames {
let mut file = File::open(filename)?;
file.read_to_string(&mut content)?;
}
Ok(content)
}
Can anyone explain what I am missing here?
The first element of the Args iterator returned by std::env::args is typically the path of the executable (see the docs
for more details).
The error arises because you do not skip the first argument: the program tries to read its own binary, which is not a sequence of valid UTF-8 bytes, so read_to_string fails.
The seemingly nonsensical message thread 'main' panicked at 'Box<Any>' appears because panic! is given the error value directly instead of
a format string followed by arguments, as in the format! syntax. Write panic!("{}", reason) instead.
use std::env;
use std::fs::File;
use std::io;
use std::io::prelude::*;
fn main() {
for filename in env::args().skip(1) {
match read_file(filename) {
Ok(content) => println!("{}", content),
Err(reason) => panic!("{}", reason),
}
}
}
fn read_file(filename: String) -> Result<String, io::Error> {
let mut content = String::new();
let mut file = File::open(filename)?;
file.read_to_string(&mut content)?;
Ok(content)
}
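To see why reading the executable fails, here is a minimal sketch that feeds read_to_string some bytes that are not valid UTF-8. The helper name read_all and the sample bytes are assumptions made for illustration; a byte slice stands in for the file so the example runs without touching the disk.

```rust
use std::io::{self, Read};

/// Reads everything from `reader` into a `String`.
/// `read_all` is a hypothetical helper used only for this illustration.
fn read_all<R: Read>(mut reader: R) -> io::Result<String> {
    let mut content = String::new();
    reader.read_to_string(&mut content)?;
    Ok(content)
}

fn main() {
    // Plain text is valid UTF-8, so this succeeds.
    println!("{:?}", read_all("hello\n".as_bytes()));

    // 0xFF can never appear in UTF-8, much like many bytes in a compiled binary.
    match read_all(&[0xFF_u8, 0xFE, 0x00][..]) {
        Ok(text) => println!("read {} bytes", text.len()),
        Err(err) => println!("error kind: {:?}", err.kind()),
    }
}
```

The second call should report the InvalidData error kind, which is the same failure the original program hit when it opened its own binary as the first "file".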
Related
I am using polars with Rust and I would like to be able to read multiple csv files as input.
I found this section in the documentation that shows how to use glob patterns to read multiple files using Python, but I could not find a way to do this in Rust.
Trying the glob pattern with Rust does not work.
The code I tried was
use polars::prelude::*;
fn main() {
let df = CsvReader::from_path("./example/*.csv").unwrap().finish().unwrap();
println!("{:?}", df);
}
And this failed with the error
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Os { code: 2, kind: NotFound, message: "No such file or directory" })', src/main.rs:26:54
stack backtrace:
0: rust_begin_unwind
I also tried creating the path separately and confirming that it represents a directory:
use std::path::PathBuf;
use polars::prelude::*;
fn main() {
let path = PathBuf::from("./example");
println!("{}", path.is_dir());
let df = CsvReader::from_path(path).unwrap().finish().unwrap();
println!("{:?}", df);
}
it also fails with the same error.
So the question is: how do I read multiple CSV/Parquet/JSON etc. files from a directory using Rust?
The section of the documentation referenced in your question uses both the glob library and a for loop in Python.
We can write a Rust version along the same lines, using the glob crate:
eager version
use std::path::PathBuf;
use glob::glob;
use polars::prelude::*;
fn main() {
let csv_files = glob("my-file-path/*csv")
.expect("Invalid glob pattern"); // glob() only fails on a malformed pattern, not on zero matches
let mut dfs: Vec<PolarsResult<DataFrame>> = Vec::new();
for entry in csv_files {
dfs.push(read_csv(entry.unwrap().to_path_buf()));
}
println!("dfs: {:?}", dfs);
}
fn read_csv(filepath: PathBuf) -> PolarsResult<DataFrame> {
CsvReader::from_path(filepath)?
.has_header(true)
.finish()
}
lazy version
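use std::path::PathBuf;
use glob::glob;
use polars::prelude::*;
fn read_csv_lazy(filepath: PathBuf) -> PolarsResult<LazyFrame> {
LazyCsvReader::new(filepath).has_header(true).finish()
}
fn main() {
// Same pattern as in the eager version.
let csv_files = glob("my-file-path/*csv")
.expect("Invalid glob pattern");
let mut ldfs: Vec<PolarsResult<LazyFrame>> = Vec::new();
for entry in csv_files {
ldfs.push(read_csv_lazy(entry.unwrap().to_path_buf()));
}
// do stuff
for f in ldfs.into_iter() {
println!("{:?}", f.unwrap().collect())
}
}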
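If you would rather not depend on the glob crate, the file list can also be gathered with the standard library alone. The sketch below is one possible approach, not the method from the polars documentation; csv_paths is a hypothetical helper name, and the temporary directory exists only so the example is self-contained.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns the regular files in `dir` whose extension is `csv`, sorted by path.
/// `csv_paths` is a hypothetical helper, not part of polars or glob.
fn csv_paths(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let is_csv = path.extension().and_then(|ext| ext.to_str()) == Some("csv");
        if path.is_file() && is_csv {
            paths.push(path);
        }
    }
    // read_dir makes no ordering guarantee, so sort for stable results.
    paths.sort();
    Ok(paths)
}

fn main() -> io::Result<()> {
    // Build a throwaway directory so the sketch runs anywhere.
    let dir = std::env::temp_dir().join("csv_paths_demo");
    fs::create_dir_all(&dir)?;
    fs::write(dir.join("a.csv"), "x\n1\n")?;
    fs::write(dir.join("b.csv"), "x\n2\n")?;
    fs::write(dir.join("notes.txt"), "not a csv")?;

    for path in csv_paths(&dir)? {
        println!("{}", path.display());
    }
    fs::remove_dir_all(&dir)
}
```

Each returned PathBuf can then be handed to a per-file reader such as the read_csv function shown earlier.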
Hi, I want to read a file containing JSON lines into a Rust app like this:
$ cargo run < users.json
and then read those lines as an iterator. As of now I have this code, but I don't want the file name hard-coded; I want the file piped into the process as in the line above.
use std::fs::File;
use std::io::{self, prelude::*, BufReader};
fn main() -> io::Result<()> {
let file = File::open("users.json")?;
let reader = BufReader::new(file);
for line in reader.lines() {
println!("{}", line?); // each item is an io::Result<String>
}
Ok(())
}
I just solved it; this does the trick:
use std::io::{self, BufRead};
fn main() {
let stdin = io::stdin();
for line in stdin.lock().lines() {
println!("{}", line.unwrap());
}
}
cargo help run reveals:
NAME
cargo-run - Run the current package
SYNOPSIS
cargo run [options] [-- args]
DESCRIPTION
Run a binary or example of the local package.
All the arguments following the two dashes (--) are passed to the binary to run. If you're passing arguments to both Cargo and the binary, the ones after -- go to the binary, the ones before go to Cargo.
So to pass arguments the syntax would be:
cargo run -- foo bar baz
You can then access the values like this:
let args: Vec<String> = env::args().collect();
A complete minimal example would be:
use std::env;
fn main() {
let args: Vec<String> = env::args().collect();
dbg!(&args);
}
Running cargo run -- users.json would result in:
$ cargo run -- users.json
Finished dev [unoptimized + debuginfo] target(s) in 0.00s
Running `target/debug/sandbox users.json`
[src/main.rs:5] &args = [
"target/debug/sandbox",
"users.json",
]
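Both styles of input can be combined: read the file named by the first argument if there is one, otherwise fall back to standard input. This is only a sketch of that common pattern; open_input is a hypothetical helper name, not something from the answers above.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};

/// Opens the named file, or falls back to standard input when no name is given.
/// `open_input` is a hypothetical helper introduced for this sketch.
fn open_input(path: Option<&str>) -> io::Result<Box<dyn BufRead>> {
    match path {
        Some(p) => Ok(Box::new(BufReader::new(File::open(p)?))),
        None => Ok(Box::new(BufReader::new(io::stdin()))),
    }
}

fn main() -> io::Result<()> {
    // args().nth(1) skips the executable path and takes the first real argument.
    let first_arg = std::env::args().nth(1);
    let reader = open_input(first_arg.as_deref())?;
    for line in reader.lines() {
        println!("{}", line?);
    }
    Ok(())
}
```

With this, cargo run -- users.json and cargo run < users.json should both print the file line by line.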
use std::fs::File;
use std::io::prelude::*;
use std::io::BufReader;
use std::iter::Iterator;
fn main() -> std::io::Result<()> {
let file = File::open("input")?; // file is input
let mut buf_reader = BufReader::new(file);
let mut contents = String::new();
buf_reader.read_to_string(&mut contents)?;
for i in contents.parse::<i32>() {
let i = i / 2;
println!("{}", i);
}
Ok(())
}
list of numbers:
50951
69212
119076
124303
95335
65069
109778
113786
124821
103423
128775
111918
138158
141455
92800
50908
107279
77352
129442
60097
84670
143682
104335
105729
87948
59542
81481
147508
str::parse::<i32> can only parse a single number at a time, so you will need to split the text first and then parse each number one by one. For example if you have one number per line and no extra whitespace, you can use BufRead::lines to process the text line by line:
use std::fs::File;
use std::io::{BufRead, BufReader};
fn main() -> std::io::Result<()> {
let file = File::open("input")?; // file is input
let buf_reader = BufReader::new(file); // lines() takes the reader by value, so no `mut` is needed
for line in buf_reader.lines() {
let value = line?
.parse::<i32>()
.expect("Not able to parse: Content is malformed !");
println!("{}", value / 2);
}
Ok(())
}
As an extra bonus this avoids reading the whole file into memory, which can be important if the file is big.
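If the input might contain blank lines or stray whitespace, which the answer above assumes it does not, a slightly more tolerant variant could look like the sketch below. The name halve_lines is hypothetical, and the results are collected into a Vec only to make the example easy to check; printing inside the loop would keep memory use constant.

```rust
use std::error::Error;
use std::io::BufRead;

/// Halves every integer found on its own line, ignoring blank lines
/// and surrounding whitespace. `halve_lines` is a hypothetical helper.
fn halve_lines<R: BufRead>(reader: R) -> Result<Vec<i32>, Box<dyn Error>> {
    let mut halves = Vec::new();
    for line in reader.lines() {
        let line = line?; // I/O error while reading
        let token = line.trim();
        if token.is_empty() {
            continue; // tolerate blank lines
        }
        halves.push(token.parse::<i32>()? / 2); // parse error stops processing
    }
    Ok(halves)
}

fn main() -> Result<(), Box<dyn Error>> {
    // The first two values from the question, with some extra whitespace.
    let sample = "50951\n  69212 \n\n";
    for value in halve_lines(sample.as_bytes())? {
        println!("{}", value);
    }
    Ok(())
}
```

A byte slice implements BufRead, so the same function accepts a BufReader around a File without changes.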
For tiny examples like this, I'd read the entire string at once, then split it up on lines.
use std::fs;
fn main() -> Result<(), Box<dyn std::error::Error>> {
let contents = fs::read_to_string("input")?;
for line in contents.trim().lines() {
let i: i32 = line.trim().parse()?;
let i = i / 2;
println!("{}", i);
}
Ok(())
}
See also:
What's the de-facto way of reading and writing files in Rust 1.x?
For tightly-controlled examples like this, I'd ignore errors occurring while parsing:
use std::fs;
fn main() -> Result<(), Box<dyn std::error::Error>> {
let contents = fs::read_to_string("input")?;
for i in contents.trim().lines().flat_map(|l| l.trim().parse::<i32>()) {
let i = i / 2;
println!("{}", i);
}
Ok(())
}
See also:
Why does `Option` support `IntoIterator`?
For fixed-input examples like this, I'd avoid opening the file at runtime at all, pushing the error to compile-time:
fn main() -> Result<(), Box<dyn std::error::Error>> {
let contents = include_str!("../input");
for i in contents.trim().lines().flat_map(|l| l.trim().parse::<i32>()) {
let i = i / 2;
println!("{}", i);
}
Ok(())
}
See also:
Is there a good way to include external resource data into Rust source code?
If I wanted to handle failures to parse but treat the iterator as if errors were impossible, I'd use Itertools::process_results:
use itertools; // 0.8.2
fn main() -> Result<(), Box<dyn std::error::Error>> {
let contents = include_str!("../input");
let numbers = contents.trim().lines().map(|l| l.trim().parse::<i32>());
let sum = itertools::process_results(numbers, |i| i.sum::<i32>());
println!("{:?}", sum);
Ok(())
}
See also:
How do I perform iterator computations over iterators of Results without collecting to a temporary vector?
How do I stop iteration and return an error when Iterator::map returns a Result::Err?
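For the specific case of summing, the standard library can do the same thing without itertools, because Result implements Sum and stops at the first Err. A minimal sketch, where sum_lines is a hypothetical helper name:

```rust
use std::num::ParseIntError;

/// Sums one integer per line, returning the first parse error if any line is invalid.
/// `sum_lines` is a hypothetical helper used for illustration.
fn sum_lines(text: &str) -> Result<i32, ParseIntError> {
    text.trim()
        .lines()
        .map(|l| l.trim().parse::<i32>())
        .sum() // summing an iterator of Result short-circuits on the first Err
}

fn main() {
    println!("{:?}", sum_lines("1\n2\n3\n"));
    println!("{:?}", sum_lines("1\nnot a number\n3\n"));
}
```

process_results remains the more general tool when the closure needs to do something other than sum or collect.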
I know how to read the command line arguments, but I am having difficulties reading the command output from a pipe.
Connect a program (A) that outputs data to my Rust program using a pipe:
A | R
The program should consume the data line by line as they come.
$ pwd | cargo run should print the pwd output.
OR
$ find . | cargo run should output the find command output which is more than 1 line.
Use BufRead::lines on a locked handle to standard input:
use std::io::{self, BufRead};
fn main() {
let stdin = io::stdin();
for line in stdin.lock().lines() {
let line = line.expect("Could not read line from standard in");
println!("{}", line);
}
}
If you wanted to reuse the allocation of the String, you could use the loop form:
use std::io::{self, BufRead};
fn main() {
let stdin = io::stdin();
let mut stdin = stdin.lock(); // locking is optional
let mut line = String::new();
// Could also `match` on the `Result` if you wanted to handle `Err`
while let Ok(n_bytes) = stdin.read_line(&mut line) {
if n_bytes == 0 { break } // zero bytes read means end of input
print!("{}", line); // `line` still ends with its newline
line.clear(); // keep the allocation, drop the contents
}
}
You just need to read from Stdin.
This is based on an example taken from the documentation:
use std::io;
fn main() {
loop {
let mut input = String::new();
match io::stdin().read_line(&mut input) {
Ok(len) => if len == 0 {
return;
} else {
println!("{}", input);
}
Err(error) => {
eprintln!("error: {}", error);
return;
}
}
}
}
It's mostly the docs example wrapped in a loop, breaking out of the loop when there is no more input, or if there is an error.
The other change is that, in your context, it's better to write errors to stderr, which is why the error branch uses eprintln! instead of println!. This macro probably wasn't available when that documentation was written.
use std::io;
fn main() {
loop {
let mut input = String::new();
let n_bytes = io::stdin()
.read_line(&mut input)
.expect("failed to read from pipe");
// read_line returns 0 only at end of input; a blank line still yields "\n".
if n_bytes == 0 {
break;
}
println!("Pipe output: {}", input.trim());
}
}
OUTPUT:
[18:50:29 Abhinickz#wsl -> pipe$ pwd
/mnt/d/Abhinickz/dev_work/learn_rust/pipe
[18:50:46 Abhinickz#wsl -> pipe$ pwd | cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
Running `target/debug/pipe`
Pipe output: /mnt/d/Abhinickz/dev_work/learn_rust/pipe
You can do it concisely with Rust's iterator methods:
use std::io::{self, BufRead};
fn main() {
// get piped input
// eg `cat file | ./program`
// ( `cat file | cargo run` also works )
let input = io::stdin().lock().lines().fold("".to_string(), |acc, line| {
acc + &line.unwrap() + "\n"
});
dbg!(input);
}
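The fold above unwraps each line, so a read error or invalid UTF-8 would panic. As a hedged alternative, collecting into io::Result<Vec<String>> propagates the first error instead; collect_lines is a hypothetical helper name, and the byte slice in main merely stands in for piped input.

```rust
use std::io::{self, BufRead};

/// Gathers every line from `reader`, stopping at the first I/O or UTF-8 error.
/// `collect_lines` is a hypothetical helper used for illustration.
fn collect_lines<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    reader.lines().collect()
}

fn main() -> io::Result<()> {
    // A byte slice stands in for the pipe; pass `io::stdin().lock()` to read real input.
    let lines = collect_lines("first\nsecond\n".as_bytes())?;
    println!("{}", lines.join("\n"));
    Ok(())
}
```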
use std::env;
use std::fs::File;
use std::io::prelude::*;
fn main() {
let args: Vec<String> = env::args().collect();
let filename = &args[1];
let mut f = File::open(filename).expect("file not found");
let mut contents = String::new();
f.read_to_string(&mut contents).expect("something went wrong reading the file");
println!("file content:\n{}", contents);
}
When I attempt to read a GBK encoded file, I get the following error:
thread 'main' panicked at 'something went wrong reading the file: Error { repr: Custom(Custom { kind: InvalidData, error: StringError("stream did not contain valid UTF-8") }) }', /checkout/src/libcore/result.rs:860
It says the stream must contain valid UTF-8. How can I read a GBK file?
I figured out how to read line by line from a GBK-encoded file.
extern crate encoding;
use std::env;
use std::fs::File;
use std::io::prelude::*;
use std::io::BufReader;
use encoding::all::GBK;
use encoding::{Encoding, DecoderTrap};
fn main() {
let args: Vec<String> = env::args().collect();
let filename = &args[1];
let file = File::open(filename).expect("file not found");
let reader = BufReader::new(&file);
// Split on raw bytes: GBK never uses 0x0A inside a multi-byte character,
// so cutting at b'\n' cannot break a character in half.
let lines = reader.split(b'\n').map(|l| l.unwrap());
for line in lines {
// Decode each line from GBK; Strict makes invalid sequences an error.
let decoded_string = GBK.decode(&line, DecoderTrap::Strict).unwrap();
println!("{}", decoded_string);
}
}
You likely want the encoding crate.
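Whichever decoding crate you pick, the first step is the same: read raw bytes with read_to_end rather than read_to_string, since only the latter insists on UTF-8. The sketch below shows just that step using the standard library; it does not decode GBK. The helper name read_bytes is hypothetical, and the sample bytes are, as far as I know, the GBK encoding of 你好.

```rust
use std::io::{self, Read};

/// Reads all bytes from `reader` and reports whether they happen to be valid UTF-8.
/// `read_bytes` is a hypothetical helper; it performs no GBK decoding.
fn read_bytes<R: Read>(mut reader: R) -> io::Result<(Vec<u8>, bool)> {
    let mut bytes = Vec::new();
    reader.read_to_end(&mut bytes)?; // succeeds for any byte content
    let is_utf8 = std::str::from_utf8(&bytes).is_ok();
    Ok((bytes, is_utf8))
}

fn main() -> io::Result<()> {
    // Sample bytes that are not valid UTF-8 (assumed to be GBK-encoded text).
    let gbk_sample = [0xC4_u8, 0xE3, 0xBA, 0xC3];
    let (bytes, is_utf8) = read_bytes(&gbk_sample[..])?;
    println!("{} bytes read, valid UTF-8: {}", bytes.len(), is_utf8);
    Ok(())
}
```

The resulting Vec<u8> is what you would hand to a decoder such as GBK.decode from the encoding crate.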