I'm creating a small application that explores variable lifetimes and threads. I want to load in a file once, and then use its contents (in this case an audio file) in a separate channel. I am having issues with value lifetimes.
I'm almost certain the syntax is wrong for what I have so far (for creating a static variable), but I can't find any resources for File types and lifetimes. What I have thus far produces this error:
let file = &File::open("src/censor-beep-01.wav").unwrap();
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ creates a temporary which is freed while still in use
let x: &'static File = file;
------------- type annotation requires that borrow lasts for `'static`
The code I currently have is:
#![allow(dead_code)]
#![allow(unused_imports)]
#![allow(unused_must_use)]
#![allow(unused_variables)]
use std::io::{self, BufRead, BufReader, stdin, Read};
use std::sync::mpsc::{self, TryRecvError};
use std::thread;
use std::time::Duration;
use std::fs::File;
use std::rc::Rc;
use rodio::Source;
fn main() {
let file = &File::open("src/censor-beep-01.wav").unwrap();
let x: &'static File = file;
loop {
let (tx, rx) = mpsc::channel();
thread::spawn(move || loop {
let tmp = x;
let (stream, stream_handle) = rodio::OutputStream::try_default().unwrap();
let source = rodio::Decoder::new(BufReader::new(tmp)).unwrap();
stream_handle.play_raw(source.convert_samples());
match rx.try_recv() {
Ok(_) | Err(TryRecvError::Disconnected) => {
break;
}
Err(TryRecvError::Empty) => {
println!("z");
thread::sleep(Duration::from_millis(1000));
}
}
});
let mut line = String::new();
let stdin = io::stdin();
let _ = stdin.lock().read_line(&mut line);
let _ = tx.send(());
return;
}
}
You need to wrap the file with Arc and Mutex like Arc::new(Mutex::new(file)) and then clone the file before passing it to the thread.
Arc is used for reference counting, which is needed to share the target object (in your case it is a file) across the thread and Mutex is needed to access the target object synchronously.
sample code (I have simplified your code to make it more understandable):
let file = Arc::new(Mutex::new(File::open("src/censor-beep-01.wav").unwrap()));
loop {
let file = file.clone();
thread::spawn(move || loop {
let mut file_guard = match file.lock() {
Ok(guard) => guard,
Err(poison) => poison.into_inner()
};
let file = file_guard.deref();
// now you can pass above file object to BufReader like "BufReader::new(file)"
});
}
reason for creates a temporary which is freed while still in use error:
You have only stored the reference of the file without the actual file object. so, the object will be droped in that line itself.
Related
I'm new to Rust and am likely have a huge knowledge gap. Basically, I'm hoping to be create a utility function that would except a regular text file or a ZIP file and return a BufRead where the caller can start processing line by line. It is working well for non ZIP files but I am not understanding how to achieve the same for the ZIP files. The ZIP files will only contain a single file within the archive which is why I'm only processing the first file in the ZipArchive.
I'm running into the the following error.
error[E0515]: cannot return value referencing local variable `archive_contents`
--> src/file_reader.rs:30:9
|
27 | let archive_file: zip::read::ZipFile = archive_contents.by_index(0).unwrap();
| ---------------- `archive_contents` is borrowed here
...
30 | Ok(Box::new(BufReader::with_capacity(128 * 1024, archive_file)))
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ returns a value referencing data owned by the current function
It seems the archive_contents is preventing the BufRead object from returning to the caller. I'm just not sure how to work around this.
file_reader.rs
use std::ffi::OsStr;
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::path::Path;
pub struct FileReader {
pub file_reader: Result<Box<BufRead>, &'static str>,
}
pub fn file_reader(filename: &str) -> Result<Box<BufRead>, &'static str> {
let path = Path::new(filename);
let file = match File::open(&path) {
Ok(file) => file,
Err(why) => panic!(
"ERROR: Could not open file, {}: {}",
path.display(),
why.to_string()
),
};
if path.extension() == Some(OsStr::new("zip")) {
// Processing ZIP file.
let mut archive_contents: zip::read::ZipArchive<std::fs::File> =
zip::ZipArchive::new(file).unwrap();
let archive_file: zip::read::ZipFile = archive_contents.by_index(0).unwrap();
// ERRORS: returns a value referencing data owned by the current function
Ok(Box::new(BufReader::with_capacity(128 * 1024, archive_file)))
} else {
// Processing non-ZIP file.
Ok(Box::new(BufReader::with_capacity(128 * 1024, file)))
}
}
main.rs
mod file_reader;
use std::io::BufRead;
fn main() {
let mut files: Vec<String> = Vec::new();
files.push("/tmp/text_file.txt".to_string());
files.push("/tmp/zip_file.zip".to_string());
for f in files {
let mut fr = match file_reader::file_reader(&f) {
Ok(fr) => fr,
Err(e) => panic!("Error reading file."),
};
fr.lines().for_each(|l| match l {
Ok(l) => {
println!("{}", l);
}
Err(e) => {
println!("ERROR: Failed to read line:\n {}", e);
}
});
}
}
Any help is greatly appreciated!
It seems the archive_contents is preventing the BufRead object from returning to the caller. I'm just not sure how to work around this.
You have to restructure the code somehow. The issue here is that, well, the archive data is part of the archive. So unlike file, archive_file is not an independent item, it is rather a pointer of sort into the archive itself. Which means the archive needs to live longer than archive_file for this code to be correct.
In a GC'd language this isn't an issue, archive_file has a reference to archive and will keep it alive however long it needs. Not so for Rust.
A simple way to fix this would be to just copy the data out of archive_file and into an owned buffer you can return to the parent. An other option might be to return a wrapper for (archive_contents, item_index), which would delegate the reading (might be somewhat tricky though). Yet another would be to not have file_reader.
Thanks to #Masklinn for the direction! Here's the working solution using their suggestion.
file_reader.rs
use std::ffi::OsStr;
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::io::Cursor;
use std::io::Error;
use std::io::Read;
use std::path::Path;
use zip::read::ZipArchive;
pub fn file_reader(filename: &str) -> Result<Box<dyn BufRead>, Error> {
let path = Path::new(filename);
let file = match File::open(&path) {
Ok(file) => file,
Err(why) => return Err(why),
};
if path.extension() == Some(OsStr::new("zip")) {
let mut archive_contents = ZipArchive::new(file)?;
let mut archive_file = archive_contents.by_index(0)?;
// Read the contents of the file into a vec.
let mut data = Vec::new();
archive_file.read_to_end(&mut data)?;
// Wrap vec in a std::io::Cursor.
let cursor = Cursor::new(data);
Ok(Box::new(cursor))
} else {
// Processing non-ZIP file.
Ok(Box::new(BufReader::with_capacity(128 * 1024, file)))
}
}
While the solution you have settled on does work, it has a few disadvantages. One is that when you read from a zip file, you have to read the contents of the file you want to process into memory before proceeding, which might be impractical for a large file. Another is that you have to heap allocate the BufReader in either case.
Another possibly more idiomatic solution is to restructure your code, such that the BufReader does not need to be returned from the function at all - rather, structure your code so that it has a function that opens the file, which in turn calls a function that processes the file:
use std::ffi::OsStr;
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::path::Path;
pub fn process_file(filename: &str) -> Result<usize, String> {
let path = Path::new(filename);
let file = match File::open(&path) {
Ok(file) => file,
Err(why) => return Err(format!(
"ERROR: Could not open file, {}: {}",
path.display(),
why.to_string()
)),
};
if path.extension() == Some(OsStr::new("zip")) {
// Handling a zip file
let mut archive_contents=zip::ZipArchive::new(file).unwrap();
let mut buf_reader = BufReader::with_capacity(128 * 1024,archive_contents.by_index(0).unwrap());
process_reader(&mut buf_reader)
} else {
// Handling a plain file.
process_reader(&mut BufReader::with_capacity(128 * 1024, file))
}
}
pub fn process_reader(reader: &mut dyn BufRead) -> Result<usize, String> {
// Example, just count the number of lines
return Ok(reader.lines().count());
}
fn main() {
let mut files: Vec<String> = Vec::new();
files.push("/tmp/text_file.txt".to_string());
files.push("/tmp/zip_file.zip".to_string());
for f in files {
match process_file(&f) {
Ok(count) => println!("File {} Count: {}", &f, count),
Err(e) => println!("Error reading file: {}", e),
};
}
}
This way, you don't need any Boxes and you don't need to read the file into memory before processing it.
A drawback to this solution would if you had multiple functions that need to be able to read from zip files. One way to handle that would be to define process_file to take a callback function to do the processing. First you would change the definition of process_file to be:
pub fn process_file<C>(filename: &str, process_reader: C) -> Result<usize, String>
where C: FnOnce(&mut dyn BufRead)->Result<usize, String>
The rest of the function body can be left unchanged. Now, process_reader can be passed into the function, like this:
process_file(&f, count_lines)
where count_lines would be the original simple function to count the lines, for instance.
This would also allow you to pass in a closure:
process_file(&f, |reader| Ok(reader.lines().count()))
Is there a good way to handle the ownership of a file held within a struct using Rust? As a stripped down example, consider:
// Buffered file IO
use std::io::{BufReader,BufRead};
use std::fs::File;
// Structure that contains a file
#[derive(Debug)]
struct Foo {
file : BufReader <File>,
data : Vec <f64>,
}
// Reads the file and strips the header
fn init_foo(fname : &str) -> Foo {
// Open the file
let mut file = BufReader::new(File::open(fname).unwrap());
// Dump the header
let mut header = String::new();
let _ = file.read_line(&mut header);
// Return our foo
Foo { file : file, data : Vec::new() }
}
// Read the remaining foo data and process it
fn read_foo(mut foo : Foo) -> Foo {
// Strip one more line
let mut header_alt = String::new();
let _ = foo.file.read_line(&mut header_alt);
// Read in the rest of the file line by line
let mut data = Vec::new();
for (lineno,line) in foo.file.lines().enumerate() {
// Strip the error
let line = line.unwrap();
// Print some diagnostic information
println!("Line {}: val {}",lineno,line);
// Save the element
data.push(line.parse::<f64>().unwrap());
}
// Export foo
Foo { data : data, ..foo}
}
fn main() {
// Initialize our foo
let foo = init_foo("foo.txt");
// Read in our data
let foo = read_foo(foo);
// Print out some debugging info
println!("{:?}",foo);
}
This currently gives the compilation error:
error[E0382]: use of moved value: `foo.file`
--> src/main.rs:48:5
|
35 | for (lineno,line) in foo.file.lines().enumerate() {
| -------- value moved here
...
48 | Foo { data : data, ..foo}
| ^^^^^^^^^^^^^^^^^^^^^^^^^ value used here after move
|
= note: move occurs because `foo.file` has type `std::io::BufReader<std::fs::File>`, which does not implement the `Copy` trait
error: aborting due to previous error
For more information about this error, try `rustc --explain E0382`.
error: Could not compile `rust_file_struct`.
To learn more, run the command again with --verbose.
And, to be sure, this makes sense. Here, lines() takes ownership of the buffered file, so we can't use the value in the return. What's confusing me is a better way to handle this situation. Certainly, after the for loop, the file is consumed, so it really can't be used. To better denote this, we could represent file as Option <BufReader <File>>. However, this causes some grief because the second read_line call, inside of read_foo, needs a mutable reference to file and I'm not sure how to obtain one it it's wrapped inside of an Option. Is there a good way of handling the ownership?
To be clear, this is a stripped down example. In the actual use case, there are several files as well as other data. I've things structured in this way because it represents a configuration that comes from the command line options. Some of the options are files, some are flags. In either case, I'd like to do some processing, but not all, of the files early in order to throw the appropriate errors.
I think you're on track with using the Option within the Foo struct. Assuming the struct becomes:
struct Foo {
file : Option<BufReader <File>>,
data : Vec <f64>,
}
The following code is a possible solution:
// Reads the file and strips the header
fn init_foo(fname : &str) -> Foo {
// Open the file
let mut file = BufReader::new(File::open(fname).unwrap());
// Dump the header
let mut header = String::new();
let _ = file.read_line(&mut header);
// Return our foo
Foo { file : Some(file), data : Vec::new() }
}
// Read the remaining foo data and process it
fn read_foo(foo : Foo) -> Option<Foo> {
let mut file = foo.file?;
// Strip one more line
let mut header_alt = String::new();
let _ = file.read_line(&mut header_alt);
// Read in the rest of the file line by line
let mut data = Vec::new();
for (lineno,line) in file.lines().enumerate() {
// Strip the error
let line = line.unwrap();
// Print some diagnostic information
println!("Line {}: val {}",lineno,line);
// Save the element
data.push(line.parse::<f64>().unwrap());
}
// Export foo
Some(Foo { data : data, file: None})
}
Note in this case that read_foo returns an optional Foo due to the fact that the file could be None.
On a side note, IMO, unless you absolutely need the BufReader to be travelling along with the Foo, I would discard it. As you've already found, calling lines causes a move, which makes it difficult to retain within another struct. As a suggestion, you could make the file field simply a String so that you could always derive the BufReader and read the file when needed.
For example, here's a solution where a file name (i.e. a &str) can be turned into a Foo with all the line processing done just before the construction of the struct.
// Buffered file IO
use std::io::{BufReader,BufRead};
use std::fs::File;
// Structure that contains a file
#[derive(Debug)]
struct Foo {
file : String,
data : Vec <f64>,
}
trait IntoFoo {
fn into_foo(self) -> Foo;
}
impl IntoFoo for &str {
fn into_foo(self) -> Foo {
// Open the file
let mut file = BufReader::new(File::open(self).unwrap());
// Dump the header
let mut header = String::new();
let _ = file.read_line(&mut header);
// Strip one more line
let mut header_alt = String::new();
let _ = file.read_line(&mut header_alt);
// Read in the rest of the file line by line
let mut data = Vec::new();
for (lineno,line) in file.lines().enumerate() {
// Strip the error
let line = line.unwrap();
// Print some diagnostic information
println!("Line {}: val {}",lineno,line);
// Save the element
data.push(line.parse::<f64>().unwrap());
}
Foo { file: self.to_string(), data }
}
}
fn main() {
// Read in our data from the file
let foo = "foo.txt".into_foo();
// Print out some debugging info
println!("{:?}",foo);
}
In this case, there's no need to worry about the ownership of the BufReader because it's created, used, and discarded in the same function. Of course, I don't fully know your use case, so this may not be suitable for your implementation.
With Rust being comparatively new, I've seen far too many ways of reading and writing files. Many are extremely messy snippets someone came up with for their blog, and 99% of the examples I've found (even on Stack Overflow) are from unstable builds that no longer work. Now that Rust is stable, what is a simple, readable, non-panicking snippet for reading or writing files?
This is the closest I've gotten to something that works in terms of reading a text file, but it's still not compiling even though I'm fairly certain I've included everything I should have. This is based off of a snippet I found on Google+ of all places, and the only thing I've changed is that the old BufferedReader is now just BufReader:
use std::fs::File;
use std::io::BufReader;
use std::path::Path;
fn main() {
let path = Path::new("./textfile");
let mut file = BufReader::new(File::open(&path));
for line in file.lines() {
println!("{}", line);
}
}
The compiler complains:
error: the trait bound `std::result::Result<std::fs::File, std::io::Error>: std::io::Read` is not satisfied [--explain E0277]
--> src/main.rs:7:20
|>
7 |> let mut file = BufReader::new(File::open(&path));
|> ^^^^^^^^^^^^^^
note: required by `std::io::BufReader::new`
error: no method named `lines` found for type `std::io::BufReader<std::result::Result<std::fs::File, std::io::Error>>` in the current scope
--> src/main.rs:8:22
|>
8 |> for line in file.lines() {
|> ^^^^^
To sum it up, what I'm looking for is:
brevity
readability
covers all possible errors
doesn't panic
None of the functions I show here panic on their own, but I am using expect because I don't know what kind of error handling will fit best into your application. Go read The Rust Programming Language's chapter on error handling to understand how to appropriately handle failure in your own program.
Rust 1.26 and onwards
If you don't want to care about the underlying details, there are one-line functions for reading and writing.
Read a file to a String
use std::fs;
fn main() {
let data = fs::read_to_string("/etc/hosts").expect("Unable to read file");
println!("{}", data);
}
Read a file as a Vec<u8>
use std::fs;
fn main() {
let data = fs::read("/etc/hosts").expect("Unable to read file");
println!("{}", data.len());
}
Write a file
use std::fs;
fn main() {
let data = "Some data!";
fs::write("/tmp/foo", data).expect("Unable to write file");
}
Rust 1.0 and onwards
These forms are slightly more verbose than the one-line functions that allocate a String or Vec for you, but are more powerful in that you can reuse allocated data or append to an existing object.
Reading data
Reading a file requires two core pieces: File and Read.
Read a file to a String
use std::fs::File;
use std::io::Read;
fn main() {
let mut data = String::new();
let mut f = File::open("/etc/hosts").expect("Unable to open file");
f.read_to_string(&mut data).expect("Unable to read string");
println!("{}", data);
}
Read a file as a Vec<u8>
use std::fs::File;
use std::io::Read;
fn main() {
let mut data = Vec::new();
let mut f = File::open("/etc/hosts").expect("Unable to open file");
f.read_to_end(&mut data).expect("Unable to read data");
println!("{}", data.len());
}
Write a file
Writing a file is similar, except we use the Write trait and we always write out bytes. You can convert a String / &str to bytes with as_bytes:
use std::fs::File;
use std::io::Write;
fn main() {
let data = "Some data!";
let mut f = File::create("/tmp/foo").expect("Unable to create file");
f.write_all(data.as_bytes()).expect("Unable to write data");
}
Buffered I/O
I felt a bit of a push from the community to use BufReader and BufWriter instead of reading straight from a file
A buffered reader (or writer) uses a buffer to reduce the number of I/O requests. For example, it's much more efficient to access the disk once to read 256 bytes instead of accessing the disk 256 times.
That being said, I don't believe a buffered reader/writer will be useful when reading the entire file. read_to_end seems to copy data in somewhat large chunks, so the transfer may already be naturally coalesced into fewer I/O requests.
Here's an example of using it for reading:
use std::fs::File;
use std::io::{BufReader, Read};
fn main() {
let mut data = String::new();
let f = File::open("/etc/hosts").expect("Unable to open file");
let mut br = BufReader::new(f);
br.read_to_string(&mut data).expect("Unable to read string");
println!("{}", data);
}
And for writing:
use std::fs::File;
use std::io::{BufWriter, Write};
fn main() {
let data = "Some data!";
let f = File::create("/tmp/foo").expect("Unable to create file");
let mut f = BufWriter::new(f);
f.write_all(data.as_bytes()).expect("Unable to write data");
}
A BufReader is more useful when you want to read line-by-line:
use std::fs::File;
use std::io::{BufRead, BufReader};
fn main() {
let f = File::open("/etc/hosts").expect("Unable to open file");
let f = BufReader::new(f);
for line in f.lines() {
let line = line.expect("Unable to read line");
println!("Line: {}", line);
}
}
For anybody who is writing to a file, the accepted answer is good but if you need to append to the file you have to use the OpenOptions struct instead:
use std::io::Write;
use std::fs::OpenOptions;
fn main() {
let data = "Some data!\n";
let mut f = OpenOptions::new()
.append(true)
.create(true) // Optionally create the file if it doesn't already exist
.open("/tmp/foo")
.expect("Unable to open file");
f.write_all(data.as_bytes()).expect("Unable to write data");
}
Buffered writing still works the same way:
use std::io::{BufWriter, Write};
use std::fs::OpenOptions;
fn main() {
let data = "Some data!\n";
let f = OpenOptions::new()
.append(true)
.open("/tmp/foo")
.expect("Unable to open file");
let mut f = BufWriter::new(f);
f.write_all(data.as_bytes()).expect("Unable to write data");
}
By using the Buffered I/O you can copy the file size is greater than the actual memory.
use std::fs::{File, OpenOptions};
use std::io::{BufReader, BufWriter, Write, BufRead};
fn main() {
let read = File::open(r#"E:\1.xls"#);
let write = OpenOptions::new().write(true).create(true).open(r#"E:\2.xls"#);
let mut reader = BufReader::new(read.unwrap());
let mut writer = BufWriter::new(write.unwrap());
let mut length = 1;
while length > 0 {
let buffer = reader.fill_buf().unwrap();
writer.write(buffer);
length = buffer.len();
reader.consume(length);
}
}
I am reading a string from a file, splitting it by lines into a vector and then I want to do something with the extracted lines in separate threads. Like this:
use std::fs::File;
use std::io::prelude::*;
use std::thread;
fn main() {
match File::open("data") {
Ok(mut result) => {
let mut s = String::new();
result.read_to_string(&mut s);
let k : Vec<_> = s.split("\n").collect();
for line in k {
thread::spawn(move || {
println!("nL: {:?}", line);
});
}
}
Err(err) => {
println!("Error {:?}",err);
}
}
}
Of course this throws an error, because s will go out of scope before the threads are started:
s` does not live long enough
main.rs:9 let k : Vec<_> = s.split("\n").collect();
^
What can I do now? I've tried many things like Box or Arc, but I couldn't get it working. I somehow need to create a copy of s which also lives in the threads. But how do I do that?
The problem, fundamentally, is that line is a borrowed slice into s. There's really nothing you can do here, since there's no way to guarantee that each line will not outlive s itself.
Also, just to be clear: there is absolutely no way in Rust to "extend the lifetime of a variable". It simply cannot be done.
The simplest way around this is to go from line being borrowed to owned. Like so:
use std::thread;
fn main() {
let mut s: String = "One\nTwo\nThree\n".into();
let k : Vec<String> = s.split("\n").map(|s| s.into()).collect();
for line in k {
thread::spawn(move || {
println!("nL: {:?}", line);
});
}
}
The .map(|s| s.into()) converts from &str to String. Since a String owns its contents, it can be safely moved into each thread's closure, and will live independently of the thread that created it.
Note: you could do this in nightly Rust using the new scoped thread API, but that is still unstable.
I am trying to read the contents of files in a directory in parallel. I'm running into lifetime issues.
My code looks like this:
use std::io::fs;
use std::io;
use std::collections::HashMap;
use std::comm;
use std::io::File;
fn main() {
let (tx, rx) = comm::channel(); // (Sender, Receiver)
let paths = fs::readdir(&Path::new("resources/tests")).unwrap();
for path in paths.iter() {
let task_tx = tx.clone();
spawn(proc() {
match File::open(path).read_to_end() {
Ok(data) => task_tx.send((path.filename_str().unwrap(), data)),
Err(e) => fail!("Could not read one of the files! Error: {}", e)
};
});
}
let mut results = HashMap::new();
for _ in range(0, paths.len()) {
let (filename, data) = rx.recv();
results.insert(filename, data);
}
println!("{}", results);
}
The compilation error I'm getting is:
error: paths does not live long enough
note: reference must be valid for the static lifetime...
note: ...but borrowed value is only valid for the block at 7:19
I also tried to use into_iter() (or move_iter() previously) in the loop without much success.
I'm suspecting it has to do with the spawned tasks remaining alive beyond the entire main() scope, but I don't know how I can fix this situation.
The error message might be a bit confusing but what it's telling you is that you are trying to use a reference path inside of a task.
Because spawn is using proc you can only use data that you can transfer ownership of to that task (Send kind).
To solve that you can do this (you could use a move_iter but then you can't access paths after the loop):
for path in paths.iter() {
let task_tx = tx.clone();
let p = path.clone();
spawn(proc() {
match File::open(&p).read_to_end() {
The second problem is that you are trying to send &str (filename) over a channel. Same as for tasks types used must be of kind Send:
match File::open(&p).read_to_end() {
Ok(data) => task_tx.send((p.filename_str().unwrap().to_string(), data)),
Err(e) => fail!("Could not read one of the files! Error: {}", e)
};