Copy reqwest bytes_stream in to tokio file - rust

I'm trying to copy a file downloaded with reqwest into a tokio file. The file is too large to hold in memory, so it has to go through bytes_stream() rather than bytes().
I attempted the following:
let mut tmp_file = tokio::fs::File::from(tempfile::tempfile()?);
let byte_stream = reqwest::get(&link).await?.bytes_stream();
tokio::io::copy(&mut byte_stream, &mut tmp_file).await?;
This fails due to
|
153 | tokio::io::copy(&mut byte_stream, &mut tmp_file).await?;
| --------------- ^^^^^^^^^^^^^^^^ the trait `tokio::io::AsyncRead` is not implemented for `impl Stream<Item = Result<bytes::bytes::Bytes, reqwest::Error>>`
| |
| required by a bound introduced by this call
Is there any way I can get the trait AsyncRead on the Stream, or otherwise copy this data into the file? I'm using a tokio file because I later need to AsyncRead from it. Perhaps it would make sense to copy into a regular std::fs::File and then convert it to a tokio::fs::File?

This method works. It is loosely based on the example for bytes_stream().
Note: you probably want to buffer the writes here.
use futures::StreamExt;
let mut tmp_file = tokio::fs::File::from(tempfile::tempfile()?);
let mut byte_stream = reqwest::get(&link).await?.bytes_stream();
while let Some(item) = byte_stream.next().await {
    tokio::io::copy(&mut item?.as_ref(), &mut tmp_file).await?;
}
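The buffering note can be illustrated without any async machinery. Below is a minimal synchronous sketch using only std, assuming the chunks arrive as an iterator of `io::Result<Vec<u8>>` (a stand-in for the byte stream, not reqwest's API). The function name `write_chunks_buffered` is hypothetical. In the async code, `tokio::io::BufWriter` would play the role that `std::io::BufWriter` plays here.

```rust
use std::io::{self, BufWriter, Write};

/// Writes every chunk to `dest` through a buffer and returns the byte count.
/// `chunks` is a hypothetical stand-in for the download stream.
fn write_chunks_buffered<W, I>(dest: W, chunks: I) -> io::Result<u64>
where
    W: Write,
    I: IntoIterator<Item = io::Result<Vec<u8>>>,
{
    let mut writer = BufWriter::new(dest);
    let mut total = 0u64;
    for chunk in chunks {
        let chunk = chunk?; // propagate a failed chunk, like `item?` above
        writer.write_all(&chunk)?;
        total += chunk.len() as u64;
    }
    // Flush explicitly: errors during the implicit flush on drop are ignored.
    writer.flush()?;
    Ok(total)
}
```

Small chunks are merged in memory and reach the destination in larger writes; the explicit flush at the end is what surfaces a late write error.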

Related

POLARS Dataframe innerJOIN in RUST

RUST / POLARS nooby question :)
I can not get the "inner_join" to work:
use polars::prelude::*;
use std::fs::File;
use std::path::PathBuf;
use std::env;
fn main() -> std::io::Result<()> {
    let mut root = env::current_dir().unwrap();
    let file_1 = root.join("data_1.csv");
    let file_2 = root.join("data_2.csv");
    // Get data from first file (one column data: column_1)
    let file = File::open(file_1).expect("Cannot open file.");
    let first_data = CsvReader::new(file)
        .has_header(false)
        .finish()
        .unwrap();
    // WORKS !
    println!("{}", first_data);
    // Get data from second file (one column data: column_1)
    let file = File::open(file_2).expect("Cannot open file.");
    let second_data = CsvReader::new(file)
        .has_header(false)
        .finish()
        .unwrap();
    // WORKS !
    println!("{}", second_data);
    // Trying to get an INNER join
    let all_data = first_data.inner_join(second_data, "column_1", "column_1");
    println!("{}", all_data);
    Ok(())
}
BUILD OUTPUT:
error[E0277]: `&str` is not an iterator
--> src\main.rs:33:31
|
33 | let all_data = first_data.inner_join(second_data, "column_1", "column_1");
| ^^^^^^^^^^ `&str` is not an iterator; try calling `.chars()` or `.bytes()`
|
= help: the trait `Iterator` is not implemented for `&str`
= note: required because of the requirements on the impl of `IntoIterator` for `&str`
note: required by a bound in `hash_join::<impl DataFrame>::inner_join`
--> C:\Users\rnio\.cargo\registry\src\github.com-1ecc6299db9ec823\polars-core-0.23.1\src\frame\hash_join\mod.rs:645:12
|
645 | I: IntoIterator<Item = S>,
| ^^^^^^^^^^^^^^^^^^^^^^ required by this bound in `hash_join::<impl DataFrame>::inner_join`
I'm looking for any hint about what I am missing. I looked at the Polars features and could not see a flag that is needed for join operations. Any ideas?
Thanks in advance :)
The problem is with the column arguments, not with the DataFrame.
The inner_join function takes the other DataFrame by reference plus two sets of column names, each of a type that implements IntoIterator. A bare &str does not implement IntoIterator, which is why the compiler suggests calling .chars() to iterate over its characters. That suggestion is not what you want here; wrap each column name in an array instead.
You should be able to get this to work with the following:
let all_data = first_data.inner_join(&second_data, ["column_1"], ["column_1"]);
You can see the definition of this function here: https://docs.rs/polars/latest/polars/frame/struct.DataFrame.html#method.inner_join
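The bound from the error message can be reproduced without Polars. This is a minimal sketch, assuming a hypothetical function `join_keys` that copies only the `I: IntoIterator<Item = S>` bound quoted in the build output. The `S: AsRef<str>` bound, the return type and both function names are illustrative assumptions, not Polars' code.

```rust
/// Hypothetical stand-in for a function that accepts column names.
/// It mirrors the `I: IntoIterator<Item = S>` bound from the error output.
fn join_keys<I, S>(left_on: I, right_on: I) -> Vec<(String, String)>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    // Pair each left column name with the right column name at the same position.
    left_on
        .into_iter()
        .zip(right_on)
        .map(|(l, r)| (l.as_ref().to_string(), r.as_ref().to_string()))
        .collect()
}

fn demo_join_keys() -> Vec<(String, String)> {
    // An array of names is iterable, so this compiles.
    // `join_keys("column_1", "column_1")` would be rejected with
    // "`&str` is not an iterator", as in the question.
    join_keys(["column_1"], ["column_1"])
}
```

Taking an iterable rather than a single string is what lets the same parameter describe a join on several columns at once.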

Rust Vulkano how to reuse command buffer

I've precomputed one command buffer per frame and now I'd like to submit them in a loop. This is the code I have so far:
let command_buffer = &command_buffers[image_num];
let future = previous_frame_end
    .take()
    .unwrap()
    .join(acquire_future)
    .then_execute(queue.clone(), command_buffer)
    .unwrap()
    .then_swapchain_present(queue.clone(), swapchain.clone(), image_num)
    .then_signal_fence_and_flush();
However this yields
let command_buffer = &command_buffers[image_num];
| ^^^^^^^^^^^^^^^
= note: but, the lifetime must be valid for the static lifetime...
note: ...so that the type `&PrimaryAutoCommandBuffer` will meet its required lifetime bounds
--> src\main.rs:442:22
|
442 | .then_execute(queue.clone(), command_buffer)
I can see that the function expects a 'static lifetime:
fn then_execute<Cb>(
    self,
    queue: Arc<Queue>,
    command_buffer: Cb,
) -> Result<CommandBufferExecFuture<Self, Cb>, CommandBufferExecError>
where
    Self: Sized,
    Cb: PrimaryCommandBuffer + 'static,
Perhaps I should clone this buffer, similarly to the way it's done here:
https://github.com/bwasty/vulkan-tutorial-rs/blob/f2f6935914bec4e79937b8cac415683c9fbadcb1/src/bin/15_hello_triangle.rs#L523
However, in version vulkano = "0.24.0" cloning is no longer implemented for PrimaryCommandBuffer.
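The effect of the `'static` bound can be shown in isolation. The sketch below uses only std and a hypothetical trait `Buffer` with a hypothetical `execute` function; none of it is vulkano's API. It illustrates that a borrowed element of a collection fails such a bound, while a cloned `Arc` handle satisfies it. Whether vulkano 0.24 implements its trait for an `Arc` around the command buffer is an assumption that has to be checked against the vulkano documentation.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a command buffer trait; not vulkano's trait.
trait Buffer {
    fn id(&self) -> u32;
}

struct Recorded(u32);

impl Buffer for Recorded {
    fn id(&self) -> u32 {
        self.0
    }
}

/// Assumed for illustration: references and `Arc` handles forward to the inner value.
impl<T: Buffer> Buffer for &T {
    fn id(&self) -> u32 {
        (**self).id()
    }
}

impl<T: Buffer> Buffer for Arc<T> {
    fn id(&self) -> u32 {
        (**self).id()
    }
}

/// Mirrors the shape of the `Cb: PrimaryCommandBuffer + 'static` bound.
fn execute<Cb: Buffer + 'static>(cb: Cb) -> u32 {
    cb.id()
}

fn submit_frame(buffers: &[Arc<Recorded>], image_num: usize) -> u32 {
    // `execute(&buffers[image_num])` is rejected: the borrow is tied to
    // `buffers` and therefore is not `'static`.
    // Cloning the `Arc` yields an owned handle; the buffer itself is not copied.
    execute(buffers[image_num].clone())
}
```

Cloning an `Arc` only increments a reference count, so a handle can be passed by value each frame while the recorded data stays shared.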

Why does serde_json::to_writer not require its argument to be `mut`?

So I've seen this question, which explains how serde_json can either take Readers/Writers by reference or take ownership of them. Fair enough.
But I don't get how this works for Write. All of Write's methods take &mut self, so I'd expect that a function which only knows its argument is a shared reference to something that is Write can't do anything with it. Yet this example compiles and works just fine, even though I'm passing a non-mut reference to a function that, one way or another, ends up writing to the referenced file:
extern crate serde_json;
use std::fs::OpenOptions;
use std::error::Error;
fn main() -> Result<(), Box<dyn Error>> {
    let file = OpenOptions::new()
        .create(true)
        .write(true)
        .truncate(true)
        .open("/tmp/serde.json")?;
    // why does this work?
    serde_json::to_writer(&file, &10)?;
    Ok(())
}
I'm passing a &File. As expected, calling any of Write's methods directly through that reference doesn't work:
use std::io::{self, Write};
use std::fs::OpenOptions;
fn main() -> io::Result<()> {
    let file = OpenOptions::new()
        .create(true)
        .write(true)
        .truncate(true)
        .open("/tmp/wtf")?;
    let file_ref = &file;
    // this complains about not having a mutable ref as expected
    file_ref.write(&[1,2,3])?;
    Ok(())
}
error[E0596]: cannot borrow `file_ref` as mutable, as it is not declared as mutable
--> test.rs:12:5
|
10 | let file_ref = &file;
| -------- help: consider changing this to be mutable: `mut file_ref`
11 | // this complains about not having a mutable ref as expected
12 | file_ref.write(&[1,2,3])?;
| ^^^^^^^^ cannot borrow as mutable
So what gives? Is serde_json breaking the type system somehow, or is this an intentional feature of the type system? If it's the latter, how does it work and why does it work like that?
serde_json::to_writer's first parameter accepts any type that implements Write. One such type is &File.
This may be surprising, but the docs for File explicitly state that the mutability of the reference to a file has no bearing on whether the file will change or not:
Note that, although read and write methods require a &mut File, because of the interfaces for Read and Write, the holder of a &File can still modify the file, either through methods that take &File or by retrieving the underlying OS object and modifying the file that way. Additionally, many operating systems allow concurrent modification of files by different processes. Avoid assuming that holding a &File means that the file will not change.
This might have you asking: wait, I thought the methods on Write took &mut self? And they do! But in this case, Write is implemented for &File, so the receiver passed to Write's methods is, somewhat confusingly, &mut &File (a mutable reference to an immutable reference).
This explains why your second example doesn't compile: you need to be able to take a mutable reference to file_ref, and you can't do that without the binding being mutable. This is an important thing to realize: the mutability of a binding and the mutability of the bound value are not necessarily the same.
This is hinted at in the error message when you run your code:
error[E0596]: cannot borrow `file_ref` as mutable, as it is not declared as mutable
--> src/lib.rs:11:5
|
10 | let file_ref = &file;
| -------- help: consider changing this to be mutable: `mut file_ref`
11 | file_ref.write(&[1,2,3])?;
| ^^^^^^^^ cannot borrow as mutable
error: aborting due to previous error
If you change let file_ref = &file; to let mut file_ref = &file;, your code compiles.
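A compact way to see both points together, using only std. The helper name `write_via_shared_ref` is hypothetical; the sketch relies on the `Write` implementation for `&File` discussed above.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Writes through a shared `&File`. `write_via_shared_ref` is a hypothetical helper.
fn write_via_shared_ref(path: &Path, data: &[u8]) -> io::Result<()> {
    let file = File::create(path)?; // note: `file` itself is not `mut`

    // `Write` is implemented for `&File`, so `write_all` needs `&mut &File`.
    // The binding holding the reference must therefore be `mut`.
    let mut file_ref = &file;
    file_ref.write_all(data)?;

    // The same thing without a named binding: a temporary `&File` is
    // borrowed mutably for the duration of the call.
    (&file).write_all(b"\n")?;
    Ok(())
}
```

In both calls only the reference is borrowed mutably; the `File` value is never borrowed as `&mut File`.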

Why can't I collect the Lines iterator into a vector of Strings?

I'm trying to read the lines of a text file into a vector of Strings so I can continually loop over them and write each line to a channel for testing, but the compiler complains about collect:
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::path::Path;
fn main() {
    let file = File::open(Path::new("file")).unwrap();
    let reader = BufReader::new(&file);
    let lines: Vec<String> = reader.lines().collect().unwrap();
}
The compiler complains:
error[E0282]: type annotations needed
--> src/main.rs:9:30
|
9 | let lines: Vec<String> = reader.lines().collect().unwrap();
| ^^^^^^^^^^^^^^^^^^^^^^^^ cannot infer type for `B`
|
= note: type must be known at this point
Without the .unwrap(), the compiler says:
error[E0277]: a collection of type `std::vec::Vec<std::string::String>` cannot be built from an iterator over elements of type `std::result::Result<std::string::String, std::io::Error>`
--> src/main.rs:9:45
|
9 | let lines: Vec<String> = reader.lines().collect();
| ^^^^^^^ a collection of type `std::vec::Vec<std::string::String>` cannot be built from `std::iter::Iterator<Item=std::result::Result<std::string::String, std::io::Error>>`
|
= help: the trait `std::iter::FromIterator<std::result::Result<std::string::String, std::io::Error>>` is not implemented for `std::vec::Vec<std::string::String>`
How do I tell Rust the correct type?
Since you want to collect straight into a Vec<String> while the Lines iterator is over Result<String, std::io::Error>, you need to help the type inference a little bit:
let lines: Vec<String> = reader.lines().collect::<Result<_, _>>().unwrap();
or even just:
let lines: Vec<_> = reader.lines().collect::<Result<_, _>>().unwrap();
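This way the compiler knows that there is an intermediate step with a Result<Vec<String>, io::Error>. Without the annotation, collect() can build any type that implements FromIterator for the item type, and the trailing .unwrap() does not pin down which one, hence "cannot infer type for `B`". I think this case could be improved in the future, but for now the type inference is not able to deduce this.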
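The same collection step can be tried without a file on disk. This is a minimal sketch using only std; `read_lines` and `demo_read_lines` are hypothetical names, and `Cursor` stands in for the `BufReader<&File>` from the question.

```rust
use std::io::{self, BufRead, Cursor};

/// Collects all lines, stopping at the first I/O error.
/// `read_lines` is a hypothetical helper; any `BufRead` works as input.
fn read_lines<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    // The return type names the target collection, so no turbofish is needed.
    reader.lines().collect()
}

fn demo_read_lines() -> Vec<String> {
    // `Cursor` stands in for a buffered file reader so the sketch needs no file.
    let reader = Cursor::new("first\nsecond\n");
    // The turbofish form from the answer: collect into a `Result`, then unwrap.
    let lines: Vec<String> = reader.lines().collect::<Result<_, _>>().unwrap();
    lines
}
```

Returning `io::Result<Vec<String>>` from a helper, as in `read_lines`, gives the compiler the target type through the signature and leaves the error handling to the caller.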

Reading a file with Rust - borrowed value only lives until here

I have a function that should read a file and return its contents.
fn read (file_name: &str) -> &str {
    let mut f = File::open(file_name)
        .expect(&format!("file not found: {}", file_name));
    let mut contents = String::new();
    f.read_to_string(&mut contents)
        .expect(&format!("cannot read file {}", file_name));
    return &contents;
}
But I get this error:
--> src\main.rs:20:13
|
20 | return &contents;
| ^^^^^^^^ borrowed value does not live long enough
21 | }
| - borrowed value only lives until here
|
What am I doing wrong?
My idea of what is happening here is this:
let mut f = File::open(file_name).expect(....); - this takes a handle of a file and tells the OS that we want to do things with it.
let mut contents = String::new(); - this creates a vector-like data structure on the heap in order to store the data that we are about to read from the file.
f.read_to_string(&mut contents).expect(...); - this reads the file into the contents space.
return &contents; - this returns a pointer to the vector where the file data is stored.
Why am I not able to return the pointer that I want?
How do I close my file (the f variable)? I think that Rust will close it for me after the variable goes out of scope, but what if I need to close it before that?
You are correct about the file handle being closed automatically when its variable goes out of scope. The same will happen to contents, though: it will be destroyed at the end of the function, unless you decide to return it as an owned String. In Rust, functions can't return references to values created inside them, only references derived from the arguments passed to them.
You can fix your function as follows:
fn read(file_name: &str) -> String {
    let mut f = File::open(file_name)
        .expect(&format!("file not found: {}", file_name));
    let mut contents = String::new();
    f.read_to_string(&mut contents)
        .expect(&format!("cannot read file {}", file_name));
    contents
}
Alternatively, you can pass contents as a mutable reference to the read function:
fn read(file_name: &str, contents: &mut String) { ... }
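One possible body for the second variant, as a sketch rather than the answerer's code. It returns an `io::Result` instead of panicking, and the names `read_into` and `read_owned` are hypothetical. The owned variant uses `std::fs::read_to_string` from the standard library.

```rust
use std::fs::File;
use std::io::{self, Read};

/// Appends the file's contents to a caller-owned buffer and returns the byte count.
/// `read_into` is a hypothetical name for the `&mut String` variant.
fn read_into(file_name: &str, contents: &mut String) -> io::Result<usize> {
    let mut f = File::open(file_name)?;
    f.read_to_string(contents)
    // `f` is dropped at the end of the function, which closes the file.
}

/// Owned variant: opens, reads and closes the file in one call.
fn read_owned(file_name: &str) -> io::Result<String> {
    std::fs::read_to_string(file_name)
}
```

If the handle has to be closed before the end of the function, calling `drop(f)` or wrapping its use in an inner block ends its lifetime earlier.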
