Writing to a file or String in Rust - rust

TL;DR: I want to implement trait std::io::Write that outputs to a memory buffer, ideally String, for unit-testing purposes.
I must be missing something simple.
Similar to another question, Writing to a file or stdout in Rust, I am working on a code that can work with any std::io::Write implementation.
It operates on structure defined like this:
pub struct MyStructure {
writer: Box<dyn Write>,
}
Now, it's easy to create instance writing to either a file or stdout:
impl MyStructure {
pub fn use_stdout() -> Self {
let writer = Box::new(std::io::stdout());
MyStructure { writer }
}
pub fn use_file<P: AsRef<Path>>(path: P) -> Result<Self> {
let writer = Box::new(File::create(path)?);
Ok(MyStructure { writer })
}
pub fn printit(&mut self) -> Result<()> {
self.writer.write(b"hello")?;
Ok(())
}
}
But for unit testing, I also need to have a way to run the business logic (here represented by method printit()) and trap its output, so that its content can be checked in the test.
I cannot figure out how to implement this. This playground code shows how I would like to use it, but it does not compile because it breaks borrowing rules.
// invalid code - does not compile!
fn main() {
let mut buf = Vec::new(); // This buffer should receive output
let mut x2 = MyStructure { writer: Box::new(buf) };
x2.printit().unwrap();
// now, get the collected output
let output = std::str::from_utf8(buf.as_slice()).unwrap().to_string();
// here I want to analyze the output, for instance in unit-test asserts
println!("Output to string was {}", output);
}
Any idea how to write the code correctly? I.e., how to implement a writer on top of a memory structure (String, Vec, ...) that can be accessed afterwards?

Something like this does work:
let mut buf = Vec::new();
{
// Use the buffer by a mutable reference
//
// Also, we're doing it inside another scope
// to help the borrow checker
let mut x2 = MyStructure { writer: Box::new(&mut buf) };
x2.printit().unwrap();
}
let output = std::str::from_utf8(buf.as_slice()).unwrap().to_string();
println!("Output to string was {}", output);
However, in order for this to work, you need to modify your type and add a lifetime parameter:
pub struct MyStructure<'a> {
writer: Box<dyn Write + 'a>,
}
Note that in your case (where you omit the + 'a part) the compiler assumes that you use 'static as the lifetime of the trait object:
// Same as your original variant
pub struct MyStructure {
writer: Box<dyn Write + 'static>
}
This limits the set of types which could be used here, in particular, you cannot use any kinds of borrowed references. Therefore, for maximum genericity we have to be explicit here and define a lifetime parameter.
Also note that depending on your use case, you can use generics instead of trait objects:
pub struct MyStructure<W: Write> {
writer: W
}
In this case the types are fully visible at any point of your program, and therefore no additional lifetime annotation is needed.

Related

How to store an iterator over stdin in a structure?

I created a structure where an iterator over a file or stdin should be stored, but compiler yelling at me :)
I decided that Lines is the struct I need to store in my struct to iterate using it later and Box will allow to store variable with unknown size, so I define my structure like that:
pub struct A {
pub input: Box<Lines<BufRead>>,
}
I want to do something like this later:
let mut a = A {
input: /* don't know what should be here yet */,
};
if something {
a.input = Box::new(io::stdin().lock().lines());
} else {
a.input = Box::new(BufReader::new(file).lines());
}
And finally
for line in a.input {
// ...
}
But I got an error from the compiler
error[E0277]: the size for values of type `(dyn std::io::BufRead + 'static)` cannot be known at compilation time
--> src/context.rs:11:5
|
11 | pub input: Box<Lines<BufRead>>,
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `(dyn std::io::BufRead + 'static)`
= note: to learn more, visit <https://doc.rust-lang.org/book/second-edition/ch19-04-advanced-types.html#dynamically-sized-types-and-sized>
= note: required by `std::io::Lines`
How can I achieve my goal?
The most generic answer to your question is that you don't / can't. Locking stdin returns a type that references the Stdin value. You cannot create a local value (stdin()), take a reference to it (.lock()), and then return that reference.
If you just want to do this inside of a function without returning it, then you can create a trait object:
use std::io::{self, prelude::*, BufReader};
fn example(file: Option<std::fs::File>) {
let stdin;
let mut stdin_lines;
let mut file_lines;
let input: &mut Iterator<Item = _> = match file {
None => {
stdin = io::stdin();
stdin_lines = stdin.lock().lines();
&mut stdin_lines
}
Some(file) => {
file_lines = BufReader::new(file).lines();
&mut file_lines
}
};
for line in input {
// ...
}
}
Or create a new generic function that you can pass either type of concrete iterator to:
use std::io::{self, prelude::*, BufReader};
fn example(file: Option<std::fs::File>) {
match file {
None => finally(io::stdin().lock().lines()),
Some(file) => finally(BufReader::new(file).lines()),
}
}
fn finally(input: impl Iterator<Item = io::Result<String>>) {
for line in input {
// ...
}
}
You could put either the trait object or the generic type into a structure even though you can't return it:
struct A<'a> {
input: &mut Iterator<Item = io::Result<String>>,
}
struct A<I>
where
I: Iterator<Item = io::Result<String>>,
{
input: I,
}
If you are feeling adventurous, you might be able to use some unsafe code / crates wrapping unsafe code to store the Stdin value and the iterator referencing it together, which is not universally safe.
See also:
Is there a way to use locked standard input and output in a constructor to live as long as the struct you're constructing?
Is there any way to return a reference to a variable created in a function?
Why can't I store a value and a reference to that value in the same struct?
How can I store a Chars iterator in the same struct as the String it is iterating on?
Are polymorphic variables allowed?
input: Box<Lines<BufRead>>,
This is invalid because Lines is not a trait. You want either:
use std::io::{prelude::*, Lines};
pub struct A {
pub input: Lines<Box<BufRead>>,
}
Or
use std::io;
pub struct A {
pub input: Box<Iterator<Item = io::Result<String>>>,
}

Discerning lifetimes understanding the move keyword

I've been playing around with AudioUnit via Rust and the Rust library coreaudio-rs. Their example seems to work well:
extern crate coreaudio;
use coreaudio::audio_unit::{AudioUnit, IOType};
use coreaudio::audio_unit::render_callback::{self, data};
use std::f32::consts::PI;
struct Iter {
value: f32,
}
impl Iterator for Iter {
type Item = [f32; 2];
fn next(&mut self) -> Option<[f32; 2]> {
self.value += 440.0 / 44_100.0;
let amp = (self.value * PI * 2.0).sin() as f32 * 0.15;
Some([amp, amp])
}
}
fn main() {
run().unwrap()
}
fn run() -> Result<(), coreaudio::Error> {
// 440hz sine wave generator.
let mut samples = Iter { value: 0.0 };
//let buf: Vec<[f32; 2]> = vec![[0.0, 0.0]];
//let mut samples = buf.iter();
// Construct an Output audio unit that delivers audio to the default output device.
let mut audio_unit = try!(AudioUnit::new(IOType::DefaultOutput));
// Q: What is this type?
let callback = move |args| {
let Args { num_frames, mut data, .. } = args;
for i in 0..num_frames {
let sample = samples.next().unwrap();
for (channel_idx, channel) in data.channels_mut().enumerate() {
channel[i] = sample[channel_idx];
}
}
Ok(())
};
type Args = render_callback::Args<data::NonInterleaved<f32>>;
try!(audio_unit.set_render_callback(callback));
try!(audio_unit.start());
std::thread::sleep(std::time::Duration::from_millis(30000));
Ok(())
}
However, changing it up a little bit to load via a buffer doesn't work as well:
extern crate coreaudio;
use coreaudio::audio_unit::{AudioUnit, IOType};
use coreaudio::audio_unit::render_callback::{self, data};
fn main() {
run().unwrap()
}
fn run() -> Result<(), coreaudio::Error> {
let buf: Vec<[f32; 2]> = vec![[0.0, 0.0]];
let mut samples = buf.iter();
// Construct an Output audio unit that delivers audio to the default output device.
let mut audio_unit = try!(AudioUnit::new(IOType::DefaultOutput));
// Q: What is this type?
let callback = move |args| {
let Args { num_frames, mut data, .. } = args;
for i in 0..num_frames {
let sample = samples.next().unwrap();
for (channel_idx, channel) in data.channels_mut().enumerate() {
channel[i] = sample[channel_idx];
}
}
Ok(())
};
type Args = render_callback::Args<data::NonInterleaved<f32>>;
try!(audio_unit.set_render_callback(callback));
try!(audio_unit.start());
std::thread::sleep(std::time::Duration::from_millis(30000));
Ok(())
}
It says, correctly so, that buf only lives until the end of run and does not live long enough for the audio unit—which makes sense, because "borrowed value must be valid for the static lifetime...".
In any case, that doesn't bother me; I can modify the iterator to load and read from the buffer just fine. However, it does raise some questions:
Why does the Iter { value: 0.0 } have the 'static lifetime?
If it doesn't have the 'static lifetime, why does it say the borrowed value must be valid for the 'static lifetime?
If it does have the 'static lifetime, why? It seems like it would be on the heap and closed on by callback.
I understand that the move keyword allows moving inside the closure, which doesn't help me understand why it interacts with lifetimes. Why can't it move the buffer? Do I have to move both the buffer and the iterator into the closure? How would I do that?
Over all this, how do I figure out the expected lifetime without trying to be a compiler myself? It doesn't seem like guessing and compiling is always a straightforward method to resolving these issues.
Why does the Iter { value: 0.0 } have the 'static lifetime?
It doesn't; only references have lifetimes.
why does it say the borrowed value must be valid for the 'static lifetime
how do I figure out the expected lifetime without trying to be a compiler myself
Read the documentation; it tells you the restriction:
fn set_render_callback<F, D>(&mut self, f: F) -> Result<(), Error>
where
F: FnMut(Args<D>) -> Result<(), ()> + 'static, // <====
D: Data
This restriction means that any references inside of F must live at least as long as the 'static lifetime. Having no references is also acceptable.
All type and lifetime restrictions are expressed at the function boundary — this is a hard rule of Rust.
I understand that the move keyword allows moving inside the closure, which doesn't help me understand why it interacts with lifetimes.
The only thing that the move keyword does is force every variable directly used in the closure to be moved into the closure. Otherwise, the compiler tries to be conservative and move in references/mutable references/values based on the usage inside the closure.
Why can't it move the buffer?
The variable buf is never used inside the closure.
Do I have to move both the buffer and the iterator into the closure? How would I do that?
By creating the iterator inside the closure. Now buf is used inside the closure and will be moved:
let callback = move |args| {
let mut samples = buf.iter();
// ...
}
It doesn't seem like guessing and compiling is always a straightforward method to resolving these issues.
Sometimes it is, and sometimes you have to think about why you believe the code to be correct and why the compiler states it isn't and come to an understanding.

Rc<Trait> to Option<T>?

I'm trying to implement a method that looks like:
fn concretify<T: Any>(rc: Rc<Any>) -> Option<T> {
Rc::try_unwrap(rc).ok().and_then(|trait_object| {
let b: Box<Any> = unimplemented!();
b.downcast().ok().map(|b| *b)
})
}
However, try_unwrap doesn't work on trait objects (which makes sense, as they're unsized). My next thought was to try to find some function that unwraps Rc<Any> into Box<Any> directly. The closest thing I could find would be
if Rc::strong_count(&rc) == 1 {
Some(unsafe {
Box::from_raw(Rc::into_raw(rc))
})
} else {
None
}
However, Rc::into_raw() appears to require that the type contained in the Rc to be Sized, and I'd ideally not like to have to use unsafe blocks.
Is there any way to implement this?
Playground Link, I'm looking for an implementation of rc_to_box here.
Unfortunately, it appears that the API of Rc is lacking the necessary method to be able to get ownership of the wrapped type when it is !Sized.
The only method which may return the interior item of a Rc is Rc::try_unwrap, however it returns Result<T, Rc<T>> which requires that T be Sized.
In order to do what you wish, you would need to have a method with a signature: Rc<T> -> Result<Box<T>, Rc<T>>, which would allow T to be !Sized, and from there you could extract Box<Any> and perform the downcast call.
However, this method is impossible due to how Rc is implemented. Here is a stripped down version of Rc:
struct RcBox<T: ?Sized> {
strong: Cell<usize>,
weak: Cell<usize>,
value: T,
}
pub struct Rc<T: ?Sized> {
ptr: *mut RcBox<T>,
_marker: PhantomData<T>,
}
Therefore, the only Box you can get out of Rc<T> is Box<RcBox<T>>.
Note that the design is severely constrained here:
single-allocation mandates that all 3 elements be in a single struct
T: ?Sized mandates that T be the last field
so there is little room for improvement in general.
However, in your specific case, it is definitely possible to improve on the generic situation. It does, of course, require unsafe code. And while it works fairly well with Rc, implementing it with Arc would be complicated by the potential data-races.
Oh... and the code is provided as is, no warranty implied ;)
use std::any::Any;
use std::{cell, mem, ptr};
use std::rc::Rc;
struct RcBox<T: ?Sized> {
strong: cell::Cell<usize>,
_weak: cell::Cell<usize>,
value: T,
}
fn concretify<T: Any>(rc: Rc<Any>) -> Option<T> {
// Will be responsible for freeing the memory if there is no other weak
// pointer by the end of this function.
let _guard = Rc::downgrade(&rc);
unsafe {
let killer: &RcBox<Any> = {
let killer: *const RcBox<Any> = mem::transmute(rc);
&*killer
};
if killer.strong.get() != 1 { return None; }
// Do not forget to decrement the count if we do take ownership,
// as otherwise memory will not get released.
let result = killer.value.downcast_ref().map(|r| {
killer.strong.set(0);
ptr::read(r as *const T)
});
// Do not forget to destroy the content of the box if we did not
// take ownership
if result.is_none() {
let _: Rc<Any> = mem::transmute(killer as *const RcBox<Any>);
}
result
}
}
fn main() {
let x: Rc<Any> = Rc::new(1);
println!("{:?}", concretify::<i32>(x));
}
I don't think it's possible to implement your concretify function if you're expecting it to move the original value back out of the Rc; see this question for why.
If you're willing to return a clone, it's straightforward:
fn concretify<T: Any+Clone>(rc: Rc<Any>) -> Option<T> {
rc.downcast_ref().map(Clone::clone)
}
Here's a test:
#[derive(Debug,Clone)]
struct Foo(u32);
#[derive(Debug,Clone)]
struct Bar(i32);
fn main() {
let rc_foo: Rc<Any> = Rc::new(Foo(42));
let rc_bar: Rc<Any> = Rc::new(Bar(7));
let foo: Option<Foo> = concretify(rc_foo);
println!("Got back: {:?}", foo);
let bar: Option<Foo> = concretify(rc_bar);
println!("Got back: {:?}", bar);
}
This outputs:
Got back: Some(Foo(42))
Got back: None
Playground
If you want something more "movey", and creating your values is cheap, you could also make a dummy, use downcast_mut() instead of downcast_ref(), and then std::mem::swap with the dummy.

Why does the Rust borrow checker ignore drop()? [duplicate]

I am writing a program that writes to a file and rotates the file it's writing to every now and then. When I check to rotate the file, I can't seem to change the file since it is borrowed by my struct. Even if I drop the instance of the struct, I can't seem to regain ownership of the file to rename it.
Here is my example:
use std::fs::File;
use std::io::{Write};
use std::mem::{drop};
pub struct FileStruct<W: Write> {
pub writer: Option<W>,
}
impl <W: Write> FileStruct<W> {
pub fn new(writer: W) -> FileStruct<W> {
FileStruct {
writer: Some(writer),
}
}
}
fn main() {
let mut file = File::create("tmp.txt").unwrap();
let mut tmp = FileStruct::new(&mut file);
loop {
if true { //will be time based if check
drop(tmp);
drop(file);
file = File::create("tmp2.txt").unwrap();
tmp = FileStruct::new(&mut file);
}
// write to file
}
}
I know I can get this to work by moving the file creation into the new function call of FileStruct instead of having an intermediate variable, file, but I would like to know why this method where I forcibly drop all the variables where all the variables references should be returned doesn't work.
As the std::mem::drop documentation says,
While this does call the argument's implementation of Drop, it will not release any borrows, as borrows are based on lexical scope.
So even if you call drop, file will remain borrowed nonetheless.
Dropping tmp does not "release the borrow" of file because borrowing is lexically scoped. It's "active" as long as the program execution is within the lexical scope that contains tmp even if you drop it. What you intended to do might be possible in the future if/once "non-lexical scopes" are supported. Until then, you can make it work with RefCell:
use std::cell::RefCell;
use std::io::{ self, Write };
/// wraps a reference to a RefCell<W>
struct RefCellWriteRef<'a, W: 'a>(&'a RefCell<W>);
/// implement Write for when W is Write
impl<'a, W: Write + 'a> Write for RefCellWriteRef<'a, W> {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
let mut w = self.0.borrow_mut();
w.write(buf)
}
fn flush(&mut self) -> io::Result<()> {
let mut w = self.0.borrow_mut();
w.flush()
}
}
fn main() {
let file: RefCell<Vec<u8>> = RefCell::new(Vec::new());
// use RefCellWriteRef(&file) instead of &mut file
let mut tmp = RefCellWriteRef(&file);
for iter in 0..10 {
if iter == 5 {
drop(tmp);
file.borrow_mut().clear(); // like opening a new file
tmp = RefCellWriteRef(&file);
}
tmp.write(b"foo").unwrap();
}
drop(tmp);
println!("{}", file.borrow().len()); // should print 15
}
The trick here is that given a shared reference to a RefCell<T> you can (eventually) get a &mut T via borrow_mut(). The compile-time borrow checker is pleased because we only use a shared reference on the surface and it's OK to share file like that. Mutable aliasing is avoided by checking at runtime whether the internal T has already been mutably borrowed.

"cannot move out of variable because it is borrowed" when rotating variables

I am writing a program that writes to a file and rotates the file it's writing to every now and then. When I check to rotate the file, I can't seem to change the file since it is borrowed by my struct. Even if I drop the instance of the struct, I can't seem to regain ownership of the file to rename it.
Here is my example:
use std::fs::File;
use std::io::{Write};
use std::mem::{drop};
pub struct FileStruct<W: Write> {
pub writer: Option<W>,
}
impl <W: Write> FileStruct<W> {
pub fn new(writer: W) -> FileStruct<W> {
FileStruct {
writer: Some(writer),
}
}
}
fn main() {
let mut file = File::create("tmp.txt").unwrap();
let mut tmp = FileStruct::new(&mut file);
loop {
if true { //will be time based if check
drop(tmp);
drop(file);
file = File::create("tmp2.txt").unwrap();
tmp = FileStruct::new(&mut file);
}
// write to file
}
}
I know I can get this to work by moving the file creation into the new function call of FileStruct instead of having an intermediate variable, file, but I would like to know why this method where I forcibly drop all the variables where all the variables references should be returned doesn't work.
As the std::mem::drop documentation says,
While this does call the argument's implementation of Drop, it will not release any borrows, as borrows are based on lexical scope.
So even if you call drop, file will remain borrowed nonetheless.
Dropping tmp does not "release the borrow" of file because borrowing is lexically scoped. It's "active" as long as the program execution is within the lexical scope that contains tmp even if you drop it. What you intended to do might be possible in the future if/once "non-lexical scopes" are supported. Until then, you can make it work with RefCell:
use std::cell::RefCell;
use std::io::{ self, Write };
/// wraps a reference to a RefCell<W>
struct RefCellWriteRef<'a, W: 'a>(&'a RefCell<W>);
/// implement Write for when W is Write
impl<'a, W: Write + 'a> Write for RefCellWriteRef<'a, W> {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
let mut w = self.0.borrow_mut();
w.write(buf)
}
fn flush(&mut self) -> io::Result<()> {
let mut w = self.0.borrow_mut();
w.flush()
}
}
fn main() {
let file: RefCell<Vec<u8>> = RefCell::new(Vec::new());
// use RefCellWriteRef(&file) instead of &mut file
let mut tmp = RefCellWriteRef(&file);
for iter in 0..10 {
if iter == 5 {
drop(tmp);
file.borrow_mut().clear(); // like opening a new file
tmp = RefCellWriteRef(&file);
}
tmp.write(b"foo").unwrap();
}
drop(tmp);
println!("{}", file.borrow().len()); // should print 15
}
The trick here is that given a shared reference to a RefCell<T> you can (eventually) get a &mut T via borrow_mut(). The compile-time borrow checker is pleased because we only use a shared reference on the surface and it's OK to share file like that. Mutable aliasing is avoided by checking at runtime whether the internal T has already been mutably borrowed.

Resources