How to write an asynchronous recursive walkdir function with an asynchronous callback - rust

I'm trying to write an async function that recursively traverses the filesystem tree and calls an asynchronous callback for each file found.
This is a learning exercise; I have no real use case.
Here is what I have so far:
use async_std::{
fs::{self, *},
path::*,
prelude::*,
}; // 1.5.0, features = ["unstable"]
use futures::{
executor::block_on,
future::{BoxFuture, FutureExt},
}; // 0.3.4
use std::{marker::Sync, pin::Pin};
fn main() {
fn walkdir<F>(path: String, cb: &'static F) -> BoxFuture<'static, ()>
where
F: Fn(&DirEntry) -> BoxFuture<()> + Sync + Send,
{
async move {
let mut entries = fs::read_dir(&path).await.unwrap();
while let Some(path) = entries.next().await {
let entry = path.unwrap();
let path = entry.path().to_str().unwrap().to_string();
if entry.path().is_file().await {
cb(&entry).await
} else {
walkdir(path, cb).await
}
}
}
.boxed()
}
let foo = async {
walkdir(".".to_string(), &|entry: &DirEntry| async {
async_std::println!(">> {}\n", &entry.path().to_str().unwrap()).await
})
.await
};
block_on(foo);
}
I got this far by trial and error, but now I'm stuck on the async closure callback, which fails with this error:
warning: unused import: `path::*`
--> src/main.rs:3:5
|
3 | path::*,
| ^^^^^^^
|
= note: `#[warn(unused_imports)]` on by default
warning: unused import: `pin::Pin`
--> src/main.rs:10:25
|
10 | use std::{marker::Sync, pin::Pin};
| ^^^^^^^^
error[E0308]: mismatched types
--> src/main.rs:33:54
|
33 | walkdir(".".to_string(), &|entry: &DirEntry| async {
| ______________________________________________________^
34 | | async_std::println!(">> {}\n", &entry.path().to_str().unwrap()).await
35 | | })
| |_________^ expected struct `std::pin::Pin`, found opaque type
|
= note: expected struct `std::pin::Pin<std::boxed::Box<dyn core::future::future::Future<Output = ()> + std::marker::Send>>`
found opaque type `impl core::future::future::Future`

use async_std::{
    fs::{self, *},
    path::*,
    prelude::*,
}; // 1.5.0
use futures::{future::{Future, FutureExt, LocalBoxFuture}, executor}; // 0.3.4
fn main() {
    async fn walkdir<R>(path: impl AsRef<Path>, mut cb: impl FnMut(DirEntry) -> R)
    where
        R: Future<Output = ()>,
    {
        fn walkdir_inner<'a, R>(path: &'a Path, cb: &'a mut dyn FnMut(DirEntry) -> R) -> LocalBoxFuture<'a, ()>
        where
            R: Future<Output = ()>,
        {
            async move {
                let mut entries = fs::read_dir(path).await.unwrap();
                while let Some(path) = entries.next().await {
                    let entry = path.unwrap();
                    let path = entry.path();
                    if path.is_file().await {
                        cb(entry).await
                    } else {
                        walkdir_inner(&path, cb).await
                    }
                }
            }.boxed_local()
        }
        walkdir_inner(path.as_ref(), &mut cb).await
    }
    executor::block_on({
        walkdir(".", |entry| async move {
            async_std::println!(">> {}", entry.path().display()).await
        })
    });
}
Notable changes:
Take in AsRef<Path> instead of a String and a generic closure instead of a trait object reference
Change the closure type to be FnMut as it's more permissive
The closure returns any type that is a future.
There's an inner implementation function that hides the ugly API required for recursive async functions.
The callback takes the DirEntry by value instead of by reference.
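The same shape can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the answer's code: `Node`, `walk`, and `block_on` are hypothetical names, an in-memory tree stands in for the filesystem, and the tiny `block_on` only suits futures that never really wait. It shows the two essentials: the recursive function returns a boxed future, and the callback is passed as `&mut dyn FnMut` so it can be reborrowed at every level.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical in-memory stand-in for a directory tree.
pub enum Node {
    File(String),
    Dir(Vec<Node>),
}

// A recursive async function has to return a boxed future: an `async fn`
// that awaits itself would need a future type that contains itself.
pub fn walk<'a, R>(
    node: &'a Node,
    cb: &'a mut dyn FnMut(String) -> R,
) -> Pin<Box<dyn Future<Output = ()> + 'a>>
where
    R: Future<Output = ()> + 'a,
{
    Box::pin(async move {
        match node {
            // Call the async callback and wait for it to finish.
            Node::File(name) => cb(name.clone()).await,
            // Reborrow the callback for each recursive call.
            Node::Dir(children) => {
                for child in children {
                    walk(child, &mut *cb).await;
                }
            }
        }
    })
}

// Waker that does nothing; enough for futures that are always ready.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Minimal busy-polling executor, for illustration only.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}
```

A caller would build a `Node` tree, define a closure that returns an `async move` block, and run `block_on(walk(&tree, &mut closure))`. In real code an executor such as the one from `futures` or `async_std` would replace the hand-written `block_on`.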
See also:
How to asynchronously explore a directory and its sub-directories?
How to using async fn callback in rust

Related

What signature can I use to download files using Axum and Tokio?

I'm using axum and the following code to download files:
use axum::{
body::StreamBody,
http::{header, StatusCode},
response::{Headers, IntoResponse},
routing::get,
Router,
};
use std::net::SocketAddr;
use tokio_util::io::ReaderStream;
#[tokio::main]
async fn main() {
let app = Router::new().route("/", get(handler));
let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
axum::Server::bind(&addr)
.serve(app.into_make_service())
.await
.unwrap();
}
async fn handler() -> impl IntoResponse {
// `File` implements `AsyncRead`
let file = match tokio::fs::File::open("Cargo.toml").await {
Ok(file) => file,
Err(err) => return Err((StatusCode::NOT_FOUND, format!("File not found: {}", err))),
};
// convert the `AsyncRead` into a `Stream`
let stream = ReaderStream::new(file);
// convert the `Stream` into an `axum::body::HttpBody`
let body = StreamBody::new(stream);
let headers = Headers([
(header::CONTENT_TYPE, "text/toml; charset=utf-8"),
]);
Ok((headers, body))
}
Everything works, but I cannot find a way to move the code below into a separate function:
let file = match tokio::fs::File::open("Cargo.toml").await {
Ok(file) => file,
Err(err) => return Err((StatusCode::NOT_FOUND, format!("File not found: {}", err))),
};
I would like to use both tokio::fs::File and https://crates.io/crates/rust-s3 methods in this function.
So I need a "common type", which appears to be AsyncRead, I think.
What should be the signature of the function?
I tried with:
use tokio::io::AsyncRead;
pub struct Player {
db: Arc<DB>
}
impl Handler {
pub async fn player_pdf(
&self,
id: &str,
) -> Result<&(dyn AsyncRead)> {
//...use id here...
let file = &tokio::fs::File::open("player.pdf").await?;
Ok(file)
}
}
but I get the error:
error[E0308]: mismatched types
|
55 | Ok(file)
| -- ^^^^
| | |
| | expected reference, found struct `tokio::fs::File`
| | help: consider borrowing here: `&file`
| arguments to this enum variant are incorrect
|
= note: expected reference `&dyn tokio::io::AsyncRead`
found struct `tokio::fs::File`
I tried with: let file = &tokio::fs::File::open("player.pdf").await?; and I got:
error[E0515]: cannot return value referencing temporary value
|
43 | let file = &tokio::fs::File::open(...
| --------------------------- temporary value created here
...
55 | Ok(file)
| ^^^^^^^^ returns a value referencing data owned by the current function
What can I use?
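Returning an owned, boxed trait object might be the solution here:
impl Handler {
    pub async fn player_pdf(
        &self,
        id: &str,
    ) -> Result<Box<dyn AsyncRead>> {
        //...use id here...
        Ok(Box::new(tokio::fs::File::open("player.pdf").await?))
    }
}
Now there is no dangling reference: the file is owned by the Box and returned by value. Note that Box<T> only implements AsyncRead when T is Unpin, and an axum handler will likely also need the value to be Send, so in practice the return type may have to be Box<dyn AsyncRead + Unpin + Send>.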
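The ownership idea is the same for synchronous readers, which makes it easy to try without tokio. The following is a sketch under stated assumptions: it uses `std::io::Read` as a stand-in for `AsyncRead`, and `Source`, `open_source`, and `read_all` are hypothetical names, with the `Memory` variant playing the role of a second backend such as an S3 download.

```rust
use std::fs::File;
use std::io::{self, Cursor, Read};

// Hypothetical selector: a file on disk or bytes obtained elsewhere.
pub enum Source<'a> {
    Disk(&'a str),
    Memory(Vec<u8>),
}

// Both arms return a different concrete type; boxing them as a trait
// object gives the function a single, fully owned return type.
pub fn open_source(source: Source<'_>) -> io::Result<Box<dyn Read>> {
    match source {
        Source::Disk(path) => Ok(Box::new(File::open(path)?)),
        Source::Memory(bytes) => Ok(Box::new(Cursor::new(bytes))),
    }
}

// The caller only sees `dyn Read` and does not care where the data came from.
pub fn read_all(mut reader: Box<dyn Read>) -> io::Result<String> {
    let mut text = String::new();
    reader.read_to_string(&mut text)?;
    Ok(text)
}
```

Returning `&dyn Read` from `open_source` would fail for the same reason as in the question: the reader is created inside the function, so there is nothing for the reference to borrow from once the function returns.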

Clashing types, crossterm::Result and core::Result error[E0107]:

I know the issue is that I have two Result types from different libraries but can't find how to fix it.
[dependencies]
crossterm = "0.23"
time = "0.3.9"
tokio = { version = "1", features = ["full"] }
reqwest = { version = "0.11", features = ["blocking", "json"] }
use time::Instant;
use std::collections::HashMap;
use crossterm::{
event::{self, Event, KeyCode, KeyEvent},
Result,
};
pub fn read_char() -> Result<char> {
loop {
if let Event::Key(KeyEvent {
code: KeyCode::Char(c),
..
}) = event::read()?
{
return Ok(c);
}
}
}
fn main() -> Result<(), Box<dyn std::error::Error>> {
let instant = Instant::now();
let response = reqwest::blocking::get("https://httpbin.org/ip")?
.json::<HashMap<String, String>>()?;
let duration = instant.elapsed();
println!("ns = {:?}, response: {:#?}, ", duration.whole_nanoseconds(), response);
// Any key to continue
println!("Press any key to continue:");
println!("{:?}", read_char());
Ok(())
}
Gives the error:
error[E0107]: this type alias takes 1 generic argument but 2 generic arguments were supplied
--> src\main.rs:20:14
|
20 | fn main() -> Result<(), Box<dyn std::error::Error>> {
| ^^^^^^ -------------------------- help: remove this generic argument
| |
| expected 1 generic argument
How do I fix this? I have searched, but I am probably using the wrong terms: "namespace alias" and "core::Result error[E0107]" are not really helping.
I have tried this without success:
fn main() -> core::Result<(), Box<dyn std::error::Error>> {
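You have crossterm::Result in scope, so you have to disambiguate the Result you want to return; otherwise the compiler assumes you mean the crossterm type alias, which takes only one generic argument. Your attempt with core::Result fails because the type lives in the result module (core::result::Result or std::result::Result):
fn main() -> std::result::Result<(), Box<dyn std::error::Error>> {
    ...
    Ok(())
}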
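The clash can be reproduced without crossterm. In this sketch the module `term` and its `Result` alias are hypothetical stand-ins for the crate's one-argument alias, and `read_char` and `run` are illustrative names.

```rust
use std::error::Error;

// Hypothetical stand-in for a crate that exports a one-argument alias.
mod term {
    pub type Result<T> = std::result::Result<T, std::io::Error>;
}

// From here on, a bare `Result` means `term::Result`.
use term::Result;

// Uses the imported alias: exactly one generic argument.
pub fn read_char() -> Result<char> {
    Ok('x')
}

// Needs two generic arguments, so the standard type is spelled out in full.
pub fn run() -> std::result::Result<char, Box<dyn Error>> {
    let c = read_char()?; // io::Error converts into Box<dyn Error>
    Ok(c)
}
```

Another option is to skip the import and write the alias with its path where it is used (for example `term::Result<char>`), which leaves the bare `Result` name pointing at the standard type.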

Rust "this parameter and the return type are declared with different lifetimes"

I'm using the smol library from Rust. None of the other answers to this question helped.
smol's Executor::spawn() is declared like so:
pub fn spawn<T: Send + 'a>(&self, future: impl Future<Output = T> + Send + 'a) -> Task<T> {
Now I have a function and want to call spawn recursively like so:
async fn start(executor: &Executor<'_>) {
let server_task = executor.spawn(async {
executor.spawn(async { println!("hello"); }).await;
});
}
But I'm getting this error:
9 | async fn start(executor: &Executor<'_>) {
| ------------ -
| |
| this parameter and the return type are declared with different lifetimes...
...
18 | let server_task = executor.spawn(async {
| ^^^^^ ...but data from `executor` is returned here
How can I resolve this error? I'm very confused.
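One way to resolve it is to share the executor through an Arc and move a clone into the spawned future, so the future owns its handle to the executor instead of borrowing the parameter:
use {
    smol::{block_on, Executor},
    std::sync::Arc,
};
// --
fn main() {
    let ex = Arc::new(Executor::new());
    block_on(ex.run(start(ex.clone())));
}
async fn start(executor: Arc<Executor<'_>>) {
    // Second handle, moved into the outer task.
    let ex2 = executor.clone();
    let server_task = executor.spawn(async move {
        let t = ex2.spawn(async {
            println!("hello");
        });
        t.await;
    });
    server_task.await;
}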
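The ownership pattern itself is not specific to smol. As a rough analogy only, here is the same idea with standard-library threads, where `thread::spawn` plays the role of `Executor::spawn`; `start` and the shared log are hypothetical names for this sketch, and nothing here is smol API.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Each spawned closure receives its own Arc clone by `move`, so nothing
// borrows from the caller's stack frame.
pub fn start(log: Arc<Mutex<Vec<String>>>) {
    let outer = thread::spawn(move || {
        // Clone a second handle for the nested task before spawning it.
        let inner_log = Arc::clone(&log);
        let inner = thread::spawn(move || {
            inner_log.lock().unwrap().push("hello".to_string());
        });
        // Wait for the nested task, then record that the outer one finished.
        inner.join().unwrap();
        log.lock().unwrap().push("outer done".to_string());
    });
    outer.join().unwrap();
}
```

As with the executor, the spawned work may outlive the scope that created it, which is why a shared owner such as `Arc` is used rather than a plain reference.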

Can't get async closure to work with Warp::Filter

I am trying to get an async closure working in the and_then filter from Warp.
This is the smallest example I could come up with where I am reasonably sure I didn't leave any important details out:
use std::{convert::Infallible, sync::Arc, thread, time};
use tokio::sync::RwLock;
use warp::Filter;
fn main() {
let man = Manifest::new();
let check = warp::path("updates").and_then(|| async move { GetAvailableBinaries(&man).await });
}
async fn GetAvailableBinaries(man: &Manifest) -> Result<impl warp::Reply, Infallible> {
Ok(warp::reply::json(&man.GetAvailableBinaries().await))
}
pub struct Manifest {
binaries: Arc<RwLock<Vec<i32>>>,
}
impl Manifest {
pub fn new() -> Manifest {
let bins = Arc::new(RwLock::new(Vec::new()));
thread::spawn(move || async move {
loop {
thread::sleep(time::Duration::from_millis(10000));
}
});
Manifest { binaries: bins }
}
pub async fn GetAvailableBinaries(&self) -> Vec<i32> {
self.binaries.read().await.to_vec()
}
}
I am using:
[dependencies]
tokio = { version = "0.2", features = ["full"] }
warp = { version = "0.2", features = ["tls"] }
The error is:
error[E0525]: expected a closure that implements the `Fn` trait, but this closure only implements `FnOnce`
--> src/main.rs:9:48
|
9 | let check = warp::path("updates").and_then(|| async move { GetAvailableBinaries(&man).await });
| -------- ^^^^^^^^^^^^^ ------------------------------------ closure is `FnOnce` because it moves the variable `man` out of its environment
| | |
| | this closure implements `FnOnce`, not `Fn`
| the requirement to implement `Fn` derives from here
After making Manifest implement Clone, you can fix the error by cloning the manifest each time the closure runs (the handler function is renamed to the idiomatic snake_case get_available_binaries here):
fn main() {
    let man = Manifest::new();
    let check = warp::path("updates").and_then(move || {
        let man = man.clone();
        async move { get_available_binaries(&man).await }
    });
    warp::serve(check);
}
This moves man into the closure passed to and_then, then provides a clone of man to the async block each time the closure is executed. The async block then owns that data and can take a reference to it without worrying about executing the future after the data has been deallocated.
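The clone-per-call pattern can be checked without warp. In this sketch, `call_twice` is a hypothetical stand-in for a framework that requires `Fn` because it may invoke the handler many times, the simplified `Manifest` holds a plain `Arc<Vec<i32>>`, and `block_on` is a minimal hand-written executor that only suits futures that are immediately ready.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[derive(Clone)]
pub struct Manifest {
    binaries: Arc<Vec<i32>>,
}

// Waker that does nothing; enough for futures that are always ready.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Minimal busy-polling executor, for illustration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Hypothetical framework entry point: the `Fn` bound means the handler
// must be callable more than once.
pub fn call_twice<F, Fut>(handler: F) -> (Vec<i32>, Vec<i32>)
where
    F: Fn() -> Fut,
    Fut: Future<Output = Vec<i32>>,
{
    (block_on(handler()), block_on(handler()))
}

pub fn demo() -> (Vec<i32>, Vec<i32>) {
    let man = Manifest { binaries: Arc::new(vec![1, 2, 3]) };
    // The closure owns `man`; each call hands a fresh clone to the async
    // block, so the closure's own copy is never moved out.
    call_twice(move || {
        let man = man.clone();
        async move { man.binaries.to_vec() }
    })
}
```

Removing the `let man = man.clone();` line makes the `async move` block take the closure's only copy, and the compiler then reports the same E0525 error as in the question.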
I'm not sure this is what you're going for, but this solution builds for me:
use std::{convert::Infallible, sync::Arc, thread, time};
use tokio::sync::RwLock;
use warp::Filter;
fn main() {
let man = Manifest::new();
let check = warp::path("updates").and_then(|| async { GetAvailableBinaries(&man).await });
}
async fn GetAvailableBinaries(man: &Manifest) -> Result<impl warp::Reply, Infallible> {
Ok(warp::reply::json(&man.GetAvailableBinaries().await))
}
#[derive(Clone)]
pub struct Manifest {
binaries: Arc<RwLock<Vec<i32>>>,
}
impl Manifest {
pub fn new() -> Manifest {
let bins = Arc::new(RwLock::new(Vec::new()));
thread::spawn(move || async {
loop {
thread::sleep(time::Duration::from_millis(10000));
//mutate bins here
}
});
Manifest { binaries: bins }
}
pub async fn GetAvailableBinaries(&self) -> Vec<i32> {
self.binaries.read().await.to_vec()
}
}
The move is the reason the compiler rejected the original line: let check = warp::path("updates").and_then(|| async move { GetAvailableBinaries(&man).await });. An async move block takes ownership of every variable it uses, here man. Creating that block therefore moves man out of the closure's environment the first time the closure is called, so the compiler can only guarantee that the closure runs once: it implements FnOnce, not the Fn that and_then requires.

How do I use PickleDB with Rocket/Juniper Context?

I'm trying to write a Rocket / Juniper / Rust based GraphQL Server using PickleDB - an in-memory key/value store.
The pickle db is created / loaded at the start and given to rocket to manage:
fn rocket() -> Rocket {
let pickle_path = var_os(String::from("PICKLE_PATH")).unwrap_or(OsString::from("pickle.db"));
let pickle_db_dump_policy = PickleDbDumpPolicy::PeriodicDump(Duration::from_secs(120));
let pickle_serialization_method = SerializationMethod::Bin;
let pickle_db: PickleDb = match Path::new(&pickle_path).exists() {
false => PickleDb::new(pickle_path, pickle_db_dump_policy, pickle_serialization_method),
true => PickleDb::load(pickle_path, pickle_db_dump_policy, pickle_serialization_method).unwrap(),
};
rocket::ignite()
.manage(Schema::new(Query, Mutation))
.manage(pickle_db)
.mount(
"/",
routes![graphiql, get_graphql_handler, post_graphql_handler],
)
}
And I want to retrieve the PickleDb instance from the Rocket State in my Guard:
pub struct Context {
pickle_db: PickleDb,
}
impl juniper::Context for Context {}
impl<'a, 'r> FromRequest<'a, 'r> for Context {
type Error = ();
fn from_request(_request: &'a Request<'r>) -> request::Outcome<Context, ()> {
let pickle_db = _request.guard::<State<PickleDb>>()?.inner();
Outcome::Success(Context { pickle_db })
}
}
This does not work because the State only gives me a reference:
26 | Outcome::Success(Context { pickle_db })
| ^^^^^^^^^ expected struct `pickledb::pickledb::PickleDb`, found `&pickledb::pickledb::PickleDb`
When I change my Context struct to contain a reference I get lifetime issues which I'm not yet familiar with:
15 | pickle_db: &PickleDb,
| ^ expected named lifetime parameter
I tried 'static, which makes Rust quite unhappy, and I tried the request lifetime 'r from FromRequest, but that does not really work either...
How do I get this to work? As I'm quite new in rust, is this the right way to do things?
I finally have a solution, although the need for unsafe indicates it is sub-optimal :)
#![allow(unsafe_code)]
use pickledb::{PickleDb, PickleDbDumpPolicy, SerializationMethod};
use serde::de::DeserializeOwned;
use serde::Serialize;
use std::env;
use std::path::Path;
use std::time::Duration;
pub static mut PICKLE_DB: Option<PickleDb> = None;
pub fn cache_init() {
let pickle_path = env::var(String::from("PICKLE_PATH")).unwrap_or(String::from("pickle.db"));
let pickle_db_dump_policy = PickleDbDumpPolicy::PeriodicDump(Duration::from_secs(120));
let pickle_serialization_method = SerializationMethod::Json;
let pickle_db = match Path::new(&pickle_path).exists() {
false => PickleDb::new(
pickle_path,
pickle_db_dump_policy,
pickle_serialization_method,
),
true => PickleDb::load(
pickle_path,
pickle_db_dump_policy,
pickle_serialization_method,
)
.unwrap(),
};
unsafe {
PICKLE_DB = Some(pickle_db);
}
}
pub fn cache_get<V>(key: &str) -> Option<V>
where
V: DeserializeOwned + std::fmt::Debug,
{
unsafe {
let pickle_db = PICKLE_DB
.as_ref()
.expect("cache uninitialized - call cache_init()");
pickle_db.get::<V>(key)
}
}
pub fn cache_set<V>(key: &str, value: &V) -> Result<(), pickledb::error::Error>
where
V: Serialize,
{
unsafe {
let pickle_db = PICKLE_DB
.as_mut()
.expect("cache uninitialized - call cache_init()");
pickle_db.set::<V>(key, value)?;
Ok(())
}
}
This can simply be imported and used as expected, but I think I'll run into issues when the load gets too high, because nothing synchronizes access to the static mut.
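A global like this can usually be written without unsafe by putting the value behind a lock inside a lazily initialized static. The following is a sketch under stated assumptions: a `HashMap<String, String>` stands in for PickleDb so the example needs no external crate, `CACHE` and `cache` are hypothetical names, and `std::sync::OnceLock` requires Rust 1.70 or newer.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

// Initialized on first use; the Mutex serializes all reads and writes.
static CACHE: OnceLock<Mutex<HashMap<String, String>>> = OnceLock::new();

// Returns the shared store, creating it the first time it is needed.
fn cache() -> &'static Mutex<HashMap<String, String>> {
    CACHE.get_or_init(|| Mutex::new(HashMap::new()))
}

pub fn cache_set(key: &str, value: &str) {
    cache()
        .lock()
        .unwrap()
        .insert(key.to_string(), value.to_string());
}

pub fn cache_get(key: &str) -> Option<String> {
    cache().lock().unwrap().get(key).cloned()
}
```

With the real database, the map would be replaced by the PickleDb value and the closure passed to `get_or_init` would perform the load-or-create step from `cache_init`. The lock means concurrent requests wait for each other instead of racing.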
