Trying to iterate over 2 files in Rust

I am trying to read 2 files and compare each item in each file to see if they are equal.
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let filename1 = "file1.txt";
    let filename2 = "file2.txt";

    // Open the files in read-only mode (ignoring errors).
    let file = File::open(filename1).unwrap();
    let reader = BufReader::new(file);
    let file2 = File::open(filename2).unwrap();
    let mut reader2 = BufReader::new(file2);

    // Read the file line by line using the lines() iterator from std::io::BufRead.
    for line1 in reader.lines() {
        let line1 = line1.unwrap(); // Ignore errors.
        for line2 in reader2.lines() {
            let line2 = line2.unwrap(); // Ignore errors.
            if line2 == line1 {
                println!("{}", line2)
            }
        }
    }
}
However, this doesn't work. How do I nest a loop over one buffered reader inside a loop over another?

Your first problem is a duplicate of this question. TL;DR: you need to call by_ref if you want to be able to reuse reader2 after calling its lines method (e.g. in the next loop iteration).
With that your code will compile but won't work, because once you have processed the first line of the first file you are at the end of the second file, so the second file will appear empty when processing the subsequent lines. You can fix that by rewinding the second file for each line. The minimal set of changes that will make your code work is:
use std::fs::File;
use std::io::{BufRead, BufReader, Read, Seek, SeekFrom};

fn main() {
    let filename1 = "file1.txt";
    let filename2 = "file2.txt";

    // Open the files in read-only mode (ignoring errors).
    let file = File::open(filename1).unwrap();
    let reader = BufReader::new(file);
    let file2 = File::open(filename2).unwrap();
    let mut reader2 = BufReader::new(file2);

    // Read the file line by line using the lines() iterator from std::io::BufRead.
    for line1 in reader.lines() {
        let line1 = line1.unwrap(); // Ignore errors.
        reader2.seek(SeekFrom::Start(0)).unwrap(); // <-- Add this line
        for line2 in reader2.by_ref().lines() { // <-- Use by_ref here
            let line2 = line2.unwrap(); // Ignore errors.
            if line2 == line1 {
                println!("{}", line2)
            }
        }
    }
}
However, this will be pretty slow. You can make it much faster by reading one of the files into a HashSet and checking whether each line of the other file is in the set:
use std::collections::HashSet;
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let filename1 = "file1.txt";
    let filename2 = "file2.txt";

    // Open the files in read-only mode (ignoring errors).
    let file = File::open(filename1).unwrap();
    let reader = BufReader::new(file);
    let file2 = File::open(filename2).unwrap();
    let reader2 = BufReader::new(file2);
    let lines2 = reader2.lines().collect::<Result<HashSet<_>, _>>().unwrap();

    // Read the file line by line using the lines() iterator from std::io::BufRead.
    for line1 in reader.lines() {
        let line1 = line1.unwrap(); // Ignore errors.
        if lines2.contains(&line1) {
            println!("{}", line1)
        }
    }
}
Finally you can also read both files into HashSets and print out the intersection:
use std::collections::HashSet;
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let filename1 = "file1.txt";
    let filename2 = "file2.txt";

    // Open the files in read-only mode (ignoring errors).
    let file = File::open(filename1).unwrap();
    let reader = BufReader::new(file);
    let lines1 = reader.lines().collect::<Result<HashSet<_>, _>>().unwrap();
    let file2 = File::open(filename2).unwrap();
    let reader2 = BufReader::new(file2);
    let lines2 = reader2.lines().collect::<Result<HashSet<_>, _>>().unwrap();

    for l in lines1.intersection(&lines2) {
        println!("{}", l)
    }
}
As a bonus, this last solution will also remove duplicate lines. On the other hand, it won't preserve the order of the lines.
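If you also need to preserve the order of the lines, one possible sketch (untested, just an outline) is to keep the HashSet lookup from the second solution and track which common lines have already been printed, so each one is printed once, in the order it appears in file1.txt:

use std::collections::HashSet;
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    // Collect file2 into a set for O(1) membership checks.
    let lines2: HashSet<String> = BufReader::new(File::open("file2.txt").unwrap())
        .lines()
        .collect::<Result<_, _>>()
        .unwrap();

    // Remember which common lines were already printed, so duplicates are skipped
    // while file1's original order is kept.
    let mut printed = HashSet::new();
    for line1 in BufReader::new(File::open("file1.txt").unwrap()).lines() {
        let line1 = line1.unwrap();
        if lines2.contains(&line1) && printed.insert(line1.clone()) {
            println!("{}", line1);
        }
    }
}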

Although I found a solution, it is horribly slow. If anyone has a better solution for finding the items that appear in both files, please let me know.
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let vec2 = findvec("file1.txt".to_string());
    let vec3 = findvec("file2.txt".to_string());
    for line in &vec2 {
        for line2 in &vec3 {
            if line == line2 {
                println!("{}", line);
            }
        }
    }
}

fn findvec(filename: String) -> Vec<String> {
    // Open the file in read-only mode (ignoring errors).
    let file = File::open(filename).unwrap();
    let reader = BufReader::new(file);
    // Blank vector to collect the lines into.
    let mut myvec = Vec::new();
    // Read the file line by line using the lines() iterator from std::io::BufRead.
    for line in reader.lines() {
        let line = line.unwrap(); // Ignore errors.
        myvec.push(line);
    }
    myvec
}
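One possible way to speed this up, sketched along the lines of the HashSet answer above (untested): keep the findvec helper but collect the second file into a HashSet, so each lookup is O(1) instead of a scan of the whole vector.

use std::collections::HashSet;
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let vec1 = findvec("file1.txt".to_string());
    // Turning the second file's lines into a HashSet makes each membership
    // test O(1) instead of scanning the whole vector for every line.
    let set2: HashSet<String> = findvec("file2.txt".to_string()).into_iter().collect();
    for line in vec1 {
        if set2.contains(&line) {
            println!("{}", line);
        }
    }
}

fn findvec(filename: String) -> Vec<String> {
    let file = File::open(filename).unwrap();
    BufReader::new(file).lines().map(|l| l.unwrap()).collect()
}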

Related

How to read a text file in Rust and read multiple values per line

So basically, I have a text file with the following syntax:
String int
String int
String int
I have an idea of how to read the values if there is only one entry per line, but if there are multiple, I do not know how to do it.
In Java, I would do something simple with while and Scanner, but in Rust I have no clue.
I am fairly new to Rust, so please help me.
Thanks for your help in advance.
Solution
Here is my modified solution based on @netwave's code:
use std::fs;
use std::io::{BufRead, BufReader, Error};

fn main() -> Result<(), Error> {
    let buff_reader = BufReader::new(fs::File::open("data.txt")?);
    for line in buff_reader.lines() {
        let parsed = sscanf::scanf!(line?, "{} {}", String, i32);
        println!("{:?}\n", parsed);
    }
    Ok(())
}
You can use the BufRead trait, which has a read_line method. Alternatively, you can use lines().
The easiest option is to wrap the File instance in a BufReader:
use std::fs;
use std::io::{BufRead, BufReader};

...

let mut buff_reader = BufReader::new(fs::File::open(path)?);
loop {
    let mut buff = String::new();
    let bytes_read = buff_reader.read_line(&mut buff)?;
    if bytes_read == 0 {
        break; // read_line returns Ok(0) at end of file
    }
    println!("{}", buff);
}
Playground
Once you have each line, you can use the sscanf crate to parse it into the types you need:
let parsed = sscanf::scanf!(buff, "{} {}", String, i32);
Based on: https://doc.rust-lang.org/rust-by-example/std_misc/file/read_lines.html
With data.txt containing:
str1 100
str2 200
str3 300
use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;

fn main() {
    // The file data.txt must exist in the current path before this produces output.
    if let Ok(lines) = read_lines("./data.txt") {
        // Consumes the iterator, returns an (Optional) String.
        for line in lines {
            if let Ok(data) = line {
                let values: Vec<&str> = data.split(' ').collect();
                match values.len() {
                    2 => {
                        let strdata = values[0].parse::<String>();
                        let intdata = values[1].parse::<i32>();
                        println!("Got: {:?} {:?}", strdata, intdata);
                    },
                    _ => panic!("Invalid input line {}", data),
                };
            }
        }
    }
}

// The output is wrapped in a Result to allow matching on errors.
// Returns an Iterator to the Reader of the lines of the file.
fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where
    P: AsRef<Path>,
{
    let file = File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}
Outputs:
Got: Ok("str1") Ok(100)
Got: Ok("str2") Ok(200)
Got: Ok("str3") Ok(300)

How to pass every line from a text file as an argument in Rust

I have made this code to check for alive URLs in a text file. It was first written to check a single URL and the script worked, but then I wanted to make it multithreaded and I got an error.
Here is the original code:
use hyper_tls::HttpsConnector;
use hyper::Client;
use tokio::io::BufReader;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let https = HttpsConnector::new();
    let url = std::env::args().nth(1).expect("no list given");
    let client = Client::builder().build::<_, hyper::Body>(https);
    let reader = BufReader::new(url);
    let lines = reader.lines();
    for l in lines {
        let sep = l.parse()?;
        // Await the response...
        let resp = client.get(sep).await?;
        if resp.status() == 200 {
            println!("{}", l);
        }
        if resp.status() == 301 {
            println!("{}", l);
        }
    }
    Ok(())
}
The issue seems to be that you are passing the file's name, as opposed to its contents, to the BufReader.
In order to read the contents instead, you can use a tokio::fs::File.
Here's an example of reading a file and printing its lines to stdout using tokio and a BufReader:
use tokio::{
    fs::File,
    io::{
        // This trait needs to be imported, as the lines function being
        // used on reader is defined there.
        AsyncBufReadExt,
        BufReader,
    },
};

#[tokio::main]
async fn main() {
    // Get the file name from the command line arguments.
    let file_argument = std::env::args().nth(1).expect("Please provide a file as command line argument.");

    // Open the file.
    let file = File::open(file_argument).await.expect("Failed to open file");

    // Create a reader using the file.
    let reader = BufReader::new(file);

    // Get an iterator over the lines.
    let mut lines = reader.lines();

    // This has to be used instead of a for loop, since lines isn't a
    // normal iterator, but a Lines struct, the next element of which
    // can be obtained using the next_line function.
    while let Some(line) = lines.next_line().await.expect("Failed to read file") {
        // Print the current line.
        println!("{}", line);
    }
}
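To tie this back to the original goal, here is a possible sketch (untested, assuming the same hyper 0.14-style client and hyper-tls setup as in the question) that feeds each line of the file into the HTTPS client and prints the URLs that answer with 200 or 301:

use hyper::Client;
use hyper_tls::HttpsConnector;
use tokio::{
    fs::File,
    io::{AsyncBufReadExt, BufReader},
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let https = HttpsConnector::new();
    let client = Client::builder().build::<_, hyper::Body>(https);

    // The command line argument is the path of the URL list, one URL per line.
    let file_argument = std::env::args().nth(1).expect("no list given");
    let file = File::open(file_argument).await?;
    let mut lines = BufReader::new(file).lines();

    while let Some(line) = lines.next_line().await? {
        // Parse the line into a hyper::Uri and request it, as in the original code.
        let uri: hyper::Uri = line.parse()?;
        let resp = client.get(uri).await?;
        if resp.status() == 200 || resp.status() == 301 {
            println!("{}", line);
        }
    }
    Ok(())
}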

Why do I get an empty vector when splitting a file by lines and then words?

I'm trying to read a text file with Rust where each line has two words separated by a whitespace. I have to get the length of the first word:
use std::fs;

fn main() {
    let contents = fs::read_to_string("src/input.txt").expect("Wrong file name!");
    for line in contents.split("\n") {
        let tokens: Vec<&str> = line.split_whitespace().collect();
        println!("{}", tokens[0].len());
    }
}
The content of the input.txt file is:
monk perl
I'm running on Windows and using cargo run. I get the following error (because of tokens[0].len()):
4
thread 'main' panicked at 'index out of bounds: the len is 0 but the index is 0'
I don't know what's wrong with my code. The file "input.txt" is not empty.
By using .split("\n"), you are getting two items in the iterator. One is the line that you expect, the other is the empty string after the newline. The empty string, when split into words, is empty. This means the vector will be empty and there is not an item at index 0.
Use str::lines instead:
use std::fs;

fn main() {
    let contents = fs::read_to_string("src/input.txt").expect("Wrong file name!");
    for line in contents.lines() {
        let tokens: Vec<_> = line.split_whitespace().collect();
        println!("{}", tokens[0].len());
    }
}
See also:
Is this the right way to read lines from file and split them into words in Rust?
use std::fs;

fn main() {
    let contents = fs::read_to_string("text.txt").expect("Wrong file name!");
    for line in contents.split("\n") {
        let tokens: Vec<&str> = line.split_whitespace().collect();
        if !tokens.is_empty() { // necessary because of contents.split("\n")
            // read_dictionary is not defined in this snippet (presumably the asker's own helper).
            // let word_list = read_dictionary(tokens[0].len());
            println!("{}", tokens[0].len());
        }
    }
}

How can I read a file line-by-line, eliminate duplicates, then write back to the same file?

I want to read a file, eliminate all duplicates and write the rest back into the file - like a duplicate cleaner.
I use a Vec because a normal array has a fixed size, but my .txt file's length is flexible (am I doing this right?).
Reading the file, collecting the lines into a Vec, and deleting duplicates:
Still missing: writing back to the file.
use std::io;

fn main() {
    let path = Path::new("test.txt");
    let mut file = io::BufferedReader::new(io::File::open(&path, R));
    let mut lines: Vec<String> = file.lines().map(|x| x.unwrap()).collect();
    // dedup() deletes all duplicates if sort() is called first
    lines.sort();
    lines.dedup();
    for e in lines.iter() {
        print!("{}", e.as_slice());
    }
}
Reading + writing to the file (untested, but it should work I guess).
Missing: getting the lines into a Vec, because that doesn't seem to work without a BufferedReader (or I'm doing something else wrong, also a good chance).
use std::io;

fn main() {
    let path = Path::new("test.txt");
    let mut file = match io::File::open_mode(&path, io::Open, io::ReadWrite) {
        Ok(f) => f,
        Err(e) => panic!("file error: {}", e),
    };
    let mut lines: Vec<String> = file.lines().map(|x| x.unwrap()).collect();
    lines.sort();
    // dedup() deletes all duplicates if sort() is called first
    lines.dedup();
    for e in lines.iter() {
        file.write("{}", e);
    }
}
So .... how do I get those 2 together? :)
Ultimately, you are going to run into a problem: you are trying to write to the same file you are reading from. In this case, it's safe because you are going to read the entire file, so you don't need it after that. However, if you did try to write to the file, you'd see that opening a file for reading doesn't allow writing! Here's the code to do that:
use std::{
    fs::File,
    io::{BufRead, BufReader, Write},
};

fn main() {
    let mut file = File::open("test.txt").expect("file error");
    let reader = BufReader::new(&mut file);

    let mut lines: Vec<_> = reader
        .lines()
        .map(|l| l.expect("Couldn't read a line"))
        .collect();

    lines.sort();
    lines.dedup();

    for line in lines {
        file.write_all(line.as_bytes())
            .expect("Couldn't write to file");
    }
}
Here's the output:
% cat test.txt
a
a
b
a
% cargo run
thread 'main' panicked at 'Couldn't write to file: Os { code: 9, kind: Other, message: "Bad file descriptor" }', src/main.rs:12:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
You could open the file for both reading and writing:
use std::{
    fs::OpenOptions,
    io::{BufRead, BufReader, Write},
};

fn main() {
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .open("test.txt")
        .expect("file error");

    // Remaining code unchanged
}
But then you'd see that (a) the output is appended and (b) all the newlines are lost in the written lines, because BufRead's lines() doesn't include them.
We could reset the file pointer back to the beginning, but then you'd probably leave trailing junk at the end (deduplicating is likely to result in fewer bytes written than read). It's easier to just reopen the file for writing, which will truncate the file. Also, let's use a set data structure to do the deduplication for us!
use std::{
    collections::BTreeSet,
    fs::File,
    io::{BufRead, BufReader, Write},
};

fn main() {
    let file = File::open("test.txt").expect("file error");
    let reader = BufReader::new(file);

    let lines: BTreeSet<_> = reader
        .lines()
        .map(|l| l.expect("Couldn't read a line"))
        .collect();

    let mut file = File::create("test.txt").expect("file error");

    for line in lines {
        file.write_all(line.as_bytes())
            .expect("Couldn't write to file");
        file.write_all(b"\n").expect("Couldn't write to file");
    }
}
And the output:
% cat test.txt
a
a
b
a
a
b
a
b
% cargo run
% cat test.txt
a
b
The less-efficient but shorter solution is to read the entire file as one string and use str::lines:
use std::{
    collections::BTreeSet,
    fs::{self, File},
    io::Write,
};

fn main() {
    let contents = fs::read_to_string("test.txt").expect("can't read");

    let lines: BTreeSet<_> = contents.lines().collect();

    let mut file = File::create("test.txt").expect("can't create");

    for line in lines {
        writeln!(file, "{}", line).expect("can't write");
    }
}
See also:
What's the de-facto way of reading and writing files in Rust 1.x?
What is the best variant for appending a new line in a text file?

How to combine reading a file line by line and iterating over each character in each line?

I started from this code, which just reads every line in a file, and which works well:
use std::io::{BufRead, BufReader};
use std::fs::File;

fn main() {
    let file = File::open("chry.fa").expect("cannot open file");
    let file = BufReader::new(file);
    for line in file.lines() {
        print!("{}", line.unwrap());
    }
}
... but then I tried to also loop over each character in each line, something like this:
use std::io::{BufRead, BufReader};
use std::fs::File;

fn main() {
    let file = File::open("chry.fa").expect("cannot open file");
    let file = BufReader::new(file);
    for line in file.lines() {
        for c in line.chars() {
            print!("{}", c.unwrap());
        }
    }
}
... but it turns out that this innermost for loop is not correct. I get the following error message:
error[E0599]: no method named `chars` found for type `std::result::Result<std::string::String, std::io::Error>` in the current scope
--> src/main.rs:8:23
|
8 | for c in line.chars() {
| ^^^^^
You need to handle the potential error that could arise from each IO operation, represented by an io::Result which can contain either the requested data or an error. There are different ways to handle errors.
One way is to just ignore them and read whatever data we can get.
The code shows how this can be done:
use std::io::{BufRead, BufReader};
use std::fs::File;

fn main() {
    let file = File::open("chry.fa").expect("cannot open file");
    let file = BufReader::new(file);
    for line in file.lines().filter_map(|result| result.ok()) {
        for c in line.chars() {
            print!("{}", c);
        }
    }
}
The key points: file.lines() is an iterator that yields io::Result<String> values. In the filter_map, we convert each io::Result into an Option and filter out any occurrences of None. We're then left with just plain lines (i.e. strings).
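Another way is to propagate the errors out of main with the ? operator instead of silently skipping failed lines; a minimal sketch:

use std::fs::File;
use std::io::{self, BufRead, BufReader};

fn main() -> io::Result<()> {
    let file = BufReader::new(File::open("chry.fa")?);
    for line in file.lines() {
        // Stop at the first IO error instead of ignoring it.
        let line = line?;
        for c in line.chars() {
            print!("{}", c);
        }
    }
    Ok(())
}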
