I'm a Java developer and in the process of learning Rust. Here is a tiny example that reads the file names of the current directory into a string list/vector and then outputs it.
Java with nio:
import java.io.*;
import java.nio.file.*;
import java.util.*;
import java.util.stream.*;

public class ListFileNames {
    public static void main(String[] args) {
        final Path dir = Paths.get(".");
        final List<String> fileNames = new ArrayList<>();
        try {
            final Stream<Path> dirEntries = Files.list(dir);
            dirEntries.forEach(path -> {
                if (Files.isRegularFile(path)) {
                    fileNames.add(path.getFileName().toString());
                }
            });
        }
        catch (IOException ex) {
            System.err.println("Failed to read " + dir + ": " + ex.getMessage());
        }
        print(fileNames);
    }

    private static void print(List<String> names) {
        for (String name : names) {
            System.out.println(name);
        }
    }
}
Here is what I came up with in Rust:
use std::fs;
use std::path::Path;

fn main() {
    let dir = Path::new(".");
    let mut entries: Vec<String> = Vec::new();
    let dir_name = dir.to_str().unwrap();
    let dir_content = fs::read_dir(dir);
    match dir_content {
        Ok(dir_content) => {
            for entry in dir_content {
                if let Ok(dir_entry) = entry {
                    if let Ok(file_type) = dir_entry.file_type() {
                        if file_type.is_file() {
                            if let Some(string) = dir_entry.file_name().to_str() {
                                entries.push(String::from(string));
                            }
                        }
                    }
                }
            }
        }
        Err(error) => println!("failed to read {}: {}", dir_name, error)
    }
    print(entries);
}

fn print(names: Vec<String>) {
    for name in &names {
        println!("{}", name);
    }
}
Is there a way to reduce the excessive indentation in the Rust code, so that it is as easy to read as the Java code?
It is possible to use ? to short-circuit execution, as demonstrated by @Bazaim, although this has slightly different semantics: it stops on the first error rather than ignoring it.
To stay true to your semantics, you would move to an Iterator-based approach, as can be seen here:
use std::error::Error;
use std::fs;
use std::path::Path;

fn main() -> Result<(), Box<dyn Error>> {
    let dir = Path::new(".");
    let dir_content = fs::read_dir(dir)?;
    dir_content.into_iter()
        .filter_map(|entry| entry.ok())
        .filter(|entry| if let Ok(t) = entry.file_type() { t.is_file() } else { false })
        .filter_map(|entry| entry.file_name().into_string().ok())
        .for_each(|file_name| println!("{}", file_name));
    Ok(())
}
I am not quite happy with that filter step, but I do like the fact that the functional-style API cleanly separates each step.
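One possible way to tidy that filter step, sketched under the assumption that an unreadable file type should simply count as "not a file": Result::map followed by unwrap_or(false) replaces the if let / else. The function name regular_file_names is hypothetical, and the sketch collects into a Vec instead of printing directly.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Collects the names of the regular files in `dir`, skipping entries that
/// cannot be read or whose names are not valid UTF-8.
fn regular_file_names(dir: &Path) -> io::Result<Vec<String>> {
    let names = fs::read_dir(dir)?
        .filter_map(|entry| entry.ok())
        // Assumption: an error from file_type() is treated as "not a regular file".
        .filter(|entry| entry.file_type().map(|t| t.is_file()).unwrap_or(false))
        .filter_map(|entry| entry.file_name().into_string().ok())
        .collect();
    Ok(names)
}

fn main() -> io::Result<()> {
    for name in regular_file_names(Path::new("."))? {
        println!("{}", name);
    }
    Ok(())
}
```

Only the failure to open the directory itself is propagated with ?; everything else is skipped, as in the original code.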
I'm also a beginner at Rust.
The ? operator is very useful.
Here is the smallest version I came up with:
use std::error::Error;
use std::fs;
use std::path::Path;

fn main() -> Result<(), Box<dyn Error>> {
    let dir = Path::new(".");
    let mut entries: Vec<String> = Vec::new();
    let dir_content = fs::read_dir(dir)?;
    for entry in dir_content {
        let entry = entry?;
        if entry.file_type()?.is_file() {
            if let Some(string) = entry.file_name().to_str() {
                entries.push(String::from(string));
            }
        }
    }
    print(entries);
    Ok(())
}

fn print(names: Vec<String>) {
    for name in &names {
        println!("{}", name);
    }
}
A slightly modified version of @Bazaim's answer.
You can write the fallible code in a separate function and pass it a reference to the vector.
If it fails, you can print the error and still print the names already collected in the vector you passed in.
use std::error::Error;
use std::fs;
use std::path::Path;

fn main() {
    let mut names = vec![];
    if let Err(e) = collect(&mut names) {
        println!("Error: {}", e);
    }
    for name in names {
        println!("{}", name);
    }
}

fn collect(names: &mut Vec<String>) -> Result<(), Box<dyn Error>> {
    let dir = Path::new(".");
    let dir_content = fs::read_dir(dir)?;
    for entry in dir_content {
        let entry = entry?;
        if entry.file_type()?.is_file() {
            if let Some(string) = entry.file_name().to_str() {
                names.push(String::from(string));
            }
        }
    }
    Ok(())
}
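If the original "skip whatever fails" behaviour should be kept without the nesting, let ... else (stable since Rust 1.65) is another option: each failing step becomes an early continue. This is only a sketch; collect_names is a hypothetical name, and non-UTF-8 file names are skipped, as in the question.

```rust
use std::fs;
use std::path::Path;

/// Returns the names of the regular files in `dir`; anything that fails is skipped.
fn collect_names(dir: &Path) -> Vec<String> {
    let mut names = Vec::new();
    let dir_content = match fs::read_dir(dir) {
        Ok(content) => content,
        Err(error) => {
            eprintln!("failed to read {}: {}", dir.display(), error);
            return names;
        }
    };
    for entry in dir_content {
        // Each `let ... else` either binds the value or moves on to the next entry.
        let Ok(entry) = entry else { continue };
        let Ok(file_type) = entry.file_type() else { continue };
        if !file_type.is_file() {
            continue;
        }
        if let Ok(name) = entry.file_name().into_string() {
            names.push(name);
        }
    }
    names
}

fn main() {
    for name in collect_names(Path::new(".")) {
        println!("{}", name);
    }
}
```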
In order to learn Rust, I try to create small snippets that apply what I learn in the Rust book and follow good practices.
I have a small function to list the content of a directory:
use std::{io, fs, path::PathBuf, path::Path};
pub fn get_directory_content(path: &str) -> Result<Vec<PathBuf>, io::Error> {
let _path: bool = Path::new(path).is_dir();
match _path {
true => {
let mut result = vec![];
for file in fs::read_dir(path).unwrap() {
result.push(file.unwrap().path());
}
Ok(result)
},
false => Err(io::Error::new(io::ErrorKind::Other, " is not a directory")),
}
}
My goal is to be able to catch the error if the folder does not exist, without triggering a panic.
In main.rs:
mod utils;
fn main() {
let directory = "./qsdsqd";
let test = utils::get_directory_content(directory).unwrap();
println!("{:?}", test);
}
If the directory exists: OK, unwrap is happy. But does anyone know a "trick" to get the content of the error into the variable test? Also, can we put the value of a variable into the io::ErrorKind::Other error to get more precision (here: path)?
Next try
fn main() {
let directory = "./qsdqsd";
let a = match utils::get_directory_content(directory){
Err(e) => println!("an error: {:?}", e),
Ok(c) => println!("{:?}", c),
};
println!("{:?}", a);
}
When there is an error, fine, we get the message. But with a valid folder, the match "just" prints the result while a itself is empty, and we can't write Ok(c) => c to simply get the Ok content out of the function :/
I have a small function to list the content of a directory:
That's already a pretty bad start, because it combines a TOCTOU (time-of-check to time-of-use) race with unnecessary extra work: if you check is_dir and then try to read the directory, the entry can be deleted or swapped out from under you in between.
This is a shame, since read_dir already does exactly what you want:
pub fn get_directory_content(path: &str) -> Result<Vec<PathBuf>, io::Error> {
let mut result = vec![];
for file in fs::read_dir(path)? {
result.push(file.unwrap().path());
}
Ok(result)
}
And you can apply this to the individual entries as well:
pub fn get_directory_content(path: &str) -> Result<Vec<PathBuf>, io::Error> {
let mut result = vec![];
for file in fs::read_dir(path)? {
result.push(file?.path());
}
Ok(result)
}
When there is an error, fine, we get the message. But with a valid folder, the match "just" prints the result while a itself is empty, and we can't write Ok(c) => c to simply get the Ok content out of the function :/
Sure you can, however you still have to do something for the Err case: as most things in Rust, match is an expression, so all the branches need to return values of the same type... or not return at all:
let a = match get_directory_content(directory) {
Err(e) => {
println!("an error: {:?}", e);
return;
}
Ok(c) => c,
};
return has type !, which is Rust's "bottom" type: it's compatible with everything, because a return expression never produces a value, and thus there's no reason for it to be incompatible with anything.
Alternatively, you could update main to return a Result as well, though that also requires updating it to return a value:
fn main() -> Result<(), io::Error> {
let directory = "./.config";
let a = get_directory_content(directory)?;
println!("{:?}", a);
Ok(())
}
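On the side question of getting the path into the error: io::Error::new accepts any payload that converts into a boxed error, including a String built with format!. The sketch below keeps the original ErrorKind reported by read_dir and adds the path to the message; the message wording is an assumption, not something taken from the question.

```rust
use std::{fs, io, path::PathBuf};

pub fn get_directory_content(path: &str) -> Result<Vec<PathBuf>, io::Error> {
    // Wrap the error from read_dir so that the message names the offending path.
    let entries = fs::read_dir(path).map_err(|e| {
        io::Error::new(e.kind(), format!("cannot read directory {}: {}", path, e))
    })?;
    let mut result = vec![];
    for file in entries {
        result.push(file?.path());
    }
    Ok(result)
}

fn main() {
    match get_directory_content("./qsdsqd") {
        Ok(content) => println!("{:?}", content),
        // `{}` prints the custom message built above.
        Err(e) => println!("an error: {}", e),
    }
}
```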
You need to return c from your match statement.
Further, you need to do something in the Err case other than just print. What should a be in the error case?
I assume that you simply want to end the program, so I inserted a return.
mod utils {
use std::{fs, io, path::Path, path::PathBuf};
pub fn get_directory_content(path: &str) -> Result<Vec<PathBuf>, io::Error> {
let _path: bool = Path::new(path).is_dir();
match _path {
true => {
let mut result = vec![];
for file in fs::read_dir(path).unwrap() {
result.push(file.unwrap().path());
}
Ok(result)
}
false => Err(io::Error::new(io::ErrorKind::Other, " is not a directory")),
}
}
}
fn main() {
let directory = "./qsdqsd";
let a = match utils::get_directory_content(directory) {
Err(e) => {
println!("an error: {:?}", e);
return;
}
Ok(c) => {
println!("{:?}", c);
c
}
};
println!("{:?}", a);
}
["./qsdqsd/a.txt"]
["./qsdqsd/a.txt"]
DISCLAIMER: My answer is very superficial. @Masklinn goes into much more detail about the "cleanest way" and other issues with the given code.
Because this is the accepted answer (at the time of writing), here is what a "cleanest way" version of the code could look like:
use std::{fs, io, path::PathBuf};
pub fn get_directory_content(path: &str) -> Result<Vec<PathBuf>, io::Error> {
let mut result = vec![];
for file in fs::read_dir(path)? {
result.push(file?.path());
}
Ok(result)
}
fn main() {
let directory = "./qsdqsd";
let a = match get_directory_content(directory) {
Err(e) => {
println!("an error: {:?}", e);
return;
}
Ok(c) => c,
};
println!("{:?}", a);
}
["./qsdqsd/a.txt"]
Alternatively, you could have main() return a Result, which makes this even cleaner:
use std::{fs, io, path::PathBuf};
pub fn get_directory_content(path: &str) -> Result<Vec<PathBuf>, io::Error> {
let mut result = vec![];
for file in fs::read_dir(path)? {
result.push(file?.path());
}
Ok(result)
}
fn main() -> Result<(), io::Error> {
let directory = "./qsdqsd";
let a = get_directory_content(directory)?;
println!("{:?}", a);
Ok(())
}
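The loop can also be written as an iterator chain: collect can build a Result<Vec<_>, _> from an iterator of Results and stops at the first Err. A sketch with the same signature and the same "first error wins" behaviour as the loop above:

```rust
use std::{fs, io, path::PathBuf};

pub fn get_directory_content(path: &str) -> Result<Vec<PathBuf>, io::Error> {
    fs::read_dir(path)?
        // Turn each io::Result<DirEntry> into an io::Result<PathBuf>.
        .map(|entry| entry.map(|e| e.path()))
        // Collecting into a Result short-circuits on the first Err.
        .collect()
}

fn main() -> Result<(), io::Error> {
    let a = get_directory_content(".")?;
    println!("{:?}", a);
    Ok(())
}
```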
I'm trying to figure out how to access the EnumValueOption len associated with each member of Foo:
// proto/demo.proto:
syntax = "proto3";
import "google/protobuf/descriptor.proto";
package demo;
extend google.protobuf.EnumValueOptions {
optional uint32 len = 50000;
}
enum Foo {
None = 0 [(len) = 10];
One = 1 [(len) = 20];
Two = 2 [(len) = 30];
}
I think I should be able to do this through prost_types::FileDescriptorSet by collecting the reflection information at build time:
// build.rs:
use std::path::PathBuf;
fn main() {
let out_dir =
PathBuf::from(std::env::var("OUT_DIR").expect("OUT_DIR environment variable not set."));
prost_build::Config::new()
.file_descriptor_set_path(out_dir.join("file_descriptor_set.bin"))
.compile_protos(&["proto/demo.proto"], &["proto"])
.unwrap_or_else(|e| panic!("Failed to compile protos {:?}", e));
}
But I can't figure out how to actually retrieve the len field:
// src/lib.rs
use prost::Message;
use prost_types::FileDescriptorSet;
use prost_types::EnumValueOptions;
include!(concat!(env!("OUT_DIR"), concat!("/demo.rs")));
pub fn parse_file_descriptor_set() -> FileDescriptorSet {
let file_descriptor_set_bytes =
include_bytes!(concat!(env!("OUT_DIR"), "/file_descriptor_set.bin"));
prost_types::FileDescriptorSet::decode(&file_descriptor_set_bytes[..]).unwrap()
}
#[cfg(test)]
mod tests {
use super::*;
fn get_len(foo: Foo) -> u32 {
let set = parse_file_descriptor_set();
for f in &set.file {
for ext in &f.extension {
dbg!(ext);
}
for e in &f.enum_type {
dbg!(e);
for v in &e.value {
dbg!(v);
}
}
}
todo!()
}
#[test]
fn test_get_len() {
assert_eq!(get_len(Foo::One), 20);
}
}
Am I on the right track? Is this something that's even supported? I'm using prost, prost-types, and prost-build version 0.9.
I am trying to write a derive macro that generates a method returning a Vec with the names of the fields of a struct.
Code snippet below:
// Don't forget about the dependencies if you try to reproduce this locally
use proc_macro2::{Span, Ident};
use quote::quote;
use syn::{
punctuated::Punctuated, token::Comma, Attribute, DeriveInput, Fields, Meta, NestedMeta,
Variant, Visibility,
};
#[proc_macro_derive(StructFieldNames, attributes(struct_field_names))]
pub fn derive_field_names(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
let ast: DeriveInput = syn::parse(input).unwrap();
let (vis, ty, generics) = (&ast.vis, &ast.ident, &ast.generics);
let names_struct_ident = Ident::new(&(ty.to_string() + "FieldStaticStr"), Span::call_site());
let fields = filter_fields(match ast.data {
syn::Data::Struct(ref s) => &s.fields,
_ => panic!("FieldNames can only be derived for structs"),
});
let names_struct_fields = fields.iter().map(|(vis, ident)| {
quote! {
#vis #ident: &'static str
}
});
let mut vec_fields: Vec<String> = Vec::new();
let names_const_fields = fields.iter().map(|(_vis, ident)| {
let ident_name = ident.to_string();
vec_fields.push(ident_name);
quote! {
#vis #ident: -
}
});
let names_const_fields_as_vec = fields.iter().map(|(_vis, ident)| {
let ident_name = ident.to_string();
// vec_fields.push(ident_name)
});
let (impl_generics, ty_generics, where_clause) = generics.split_for_impl();
let tokens = quote! {
#[derive(Debug)]
#vis struct #names_struct_ident {
#(#names_struct_fields),*
}
impl #impl_generics #ty #ty_generics
#where_clause
{
#vis fn get_field_names() -> &'static str {
// stringify!(
[ #(#vec_fields),* ]
.map( |s| s.to_string())
.collect()
// )
}
}
};
tokens.into()
}
fn filter_fields(fields: &Fields) -> Vec<(Visibility, Ident)> {
fields
.iter()
.filter_map(|field| {
if field
.attrs
.iter()
.find(|attr| has_skip_attr(attr, "struct_field_names"))
.is_none()
&& field.ident.is_some()
{
let field_vis = field.vis.clone();
let field_ident = field.ident.as_ref().unwrap().clone();
Some((field_vis, field_ident))
} else {
None
}
})
.collect::<Vec<_>>()
}
const ATTR_META_SKIP: &'static str = "skip";
fn has_skip_attr(attr: &Attribute, path: &'static str) -> bool {
if let Ok(Meta::List(meta_list)) = attr.parse_meta() {
if meta_list.path.is_ident(path) {
for nested_item in meta_list.nested.iter() {
if let NestedMeta::Meta(Meta::Path(path)) = nested_item {
if path.is_ident(ATTR_META_SKIP) {
return true;
}
}
}
}
}
false
}
The code is taken from here. Basically, I just want to get those values as Strings rather than access them via Foo::FIELD_NAMES.some_random_field, because I need them for another process.
How can I achieve that?
Thanks
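For comparison, here is a sketch of the same idea using a different technique: a declarative macro_rules! macro instead of a derive macro, so it needs no syn or quote. It assumes the struct can be declared inside the macro invocation; the macro name struct_with_field_names and the struct Foo are hypothetical, while get_field_names follows the name used in the question. Field names are turned into strings with stringify!.

```rust
// Declares the struct as written and adds a get_field_names() method.
macro_rules! struct_with_field_names {
    (
        $(#[$meta:meta])*
        $vis:vis struct $name:ident {
            $($fvis:vis $field:ident : $ty:ty),* $(,)?
        }
    ) => {
        $(#[$meta])*
        $vis struct $name {
            $($fvis $field: $ty),*
        }

        impl $name {
            /// Names of the fields, in declaration order.
            $vis fn get_field_names() -> Vec<String> {
                vec![$(stringify!($field).to_string()),*]
            }
        }
    };
}

struct_with_field_names! {
    #[derive(Debug)]
    pub struct Foo {
        pub id: u32,
        pub name: String,
    }
}

fn main() {
    // Prints ["id", "name"]
    println!("{:?}", Foo::get_field_names());
}
```

The method returns Vec<String> rather than &'static str, because a list of names does not fit into a single string slice.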
This code walks the /tmp folder to show files that end in .txt:
const FOLDER_NAME: &str = "/tmp";
const PATTERN: &str = ".txt";
use std::error::Error;
use walkdir::WalkDir; // 2.2.9
fn main() -> Result<(), Box<dyn Error>> {
println!("Walking folder {}", FOLDER_NAME);
for entry in WalkDir::new(FOLDER_NAME).into_iter().filter_map(|e| e.ok()) {
let x = entry.file_name().to_str();
match x {
Some(x) if x.contains(PATTERN) => println!("This file matches: {:?}", entry),
_ => (),
}
}
Ok(())
}
Although this works, is it possible to leverage filter_map to do the suffix filtering that's currently happening in match?
You need to return the entry wrapped in a Some when the condition is true:
use std::error::Error;
use walkdir::WalkDir; // 2.2.9
const FOLDER_NAME: &str = "/tmp";
const PATTERN: &str = ".txt";
fn main() -> Result<(), Box<dyn Error>> {
println!("Walking folder {}", FOLDER_NAME);
let valid_entries = WalkDir::new(FOLDER_NAME)
.into_iter()
.flat_map(|e| e)
.flat_map(|e| {
let name = e.file_name().to_str()?;
if name.contains(PATTERN) {
Some(e)
} else {
None
}
});
for entry in valid_entries {
println!("This file matches: {:?}", entry);
}
Ok(())
}
You'll note that I've secretly switched to Iterator::flat_map. Iterator::filter_map would also work, but I find flat_map more ergonomic, especially for your "ignore the errors" case.
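The reason flat_map(|e| e) drops the errors is that Result (like Option) implements IntoIterator: an Ok yields its value once, an Err yields nothing. A std-only sketch with made-up data; keep_ok is a hypothetical helper, not part of walkdir:

```rust
/// Keeps the Ok values and silently discards the Err ones.
fn keep_ok(results: Vec<Result<i32, String>>) -> Vec<i32> {
    // Each Result is iterated as a zero-or-one element sequence.
    results.into_iter().flat_map(|r| r).collect()
}

fn main() {
    let results = vec![Ok(1), Err("bad".to_string()), Ok(3)];
    // Prints [1, 3]
    println!("{:?}", keep_ok(results));
}
```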
It's debatable whether this is useful compared to a regular Iterator::filter call:
let valid_entries = WalkDir::new(FOLDER_NAME)
.into_iter()
.flat_map(|e| e)
.filter(|e| {
e.file_name()
.to_str()
.map_or(false, |n| n.contains(PATTERN))
});
See also:
Why does `Option` support `IntoIterator`?
How can I filter an iterator when the predicate returns a Result<bool, _>?
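One detail worth noting: the question asks for files that end in .txt, while contains matches the pattern anywhere in the name (for example notes.txt.bak). A stricter predicate can be sketched with Path::extension from the standard library; has_extension is a hypothetical helper, and it expects the extension without the leading dot.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// True if the last extension of `path` is exactly `ext` (given without the dot).
fn has_extension(path: &Path, ext: &str) -> bool {
    path.extension().and_then(OsStr::to_str) == Some(ext)
}

fn main() {
    let candidates = ["/tmp/a.txt", "/tmp/notes.txt.bak", "/tmp/txt"];
    for candidate in candidates.iter().filter(|c| has_extension(Path::new(c), "txt")) {
        println!("This file matches: {}", candidate);
    }
}
```

With walkdir, the same predicate could presumably be applied to entry.path() inside the filter call shown above.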
The goal is to write a function that takes two paths, input_dir and output_dir, and converts all Markdown files from input_dir to HTML files in output_dir.
I finally managed to get it to run, but it was rather frustrating. The parts that should be hard are super easy: the actual conversion from Markdown to HTML is effectively only one line. The seemingly easy parts are what took me the longest. I replaced collecting all files into a vector of paths with the glob crate, not because I couldn't get it to work, but because it was a mess of if let and unwrap. A simple function that iterates over the list of elements and figures out which of them are actually files and not directories? Either I need four indentation levels of if let or I freak out over matches.
What am I doing wrong?
But let's start with some things I tried in order to get a list of the items in a directory, filtered to contain only actual files:
use std::fs;
use std::vec::Vec;

fn list_files (path: &str) -> Result<Vec<&str>, &str> {
    if let Ok(dir_list) = fs::read_dir(path) {
        Ok(dir_list.filter_map(|e| {
            match e {
                Ok(entry) => match entry.file_type() {
                    Ok(_) => entry.file_name().to_str(),
                    _ => None
                },
                _ => None
            }
        }).collect())
    } else {
        Err("nope")
    }
}

fn main() {
    let files = list_files("testdir");
    println!("{:?}", files.unwrap_or(Vec::new()));
}
So, this code doesn't build, because the file name returned by entry.file_name() doesn't live long enough: file_name() returns an owned OsString, and to_str() only borrows from that temporary. I guess I could somehow create an owned String, but that would introduce another nesting level because OsString::into_string() returns a Result.
Now I looked through the code of the glob crate and they just use a mutable vector:
fn list_files (path: &str) -> Result<Vec<&str>, &str> {
    let mut list = Vec::new();
    if let Ok(dir_list) = fs::read_dir(path) {
        for entry in dir_list {
            if let Ok(entry) = entry {
                if let Ok(file_type) = entry.file_type() {
                    if file_type.is_file() {
                        if let Some(name) = entry.file_name().to_str() {
                            list.push(name)
                        }
                    }
                }
            }
        }
        Ok(list)
    } else {
        Err("nope")
    }
}
This not only adds crazy nesting, it also fails with the same problem. If I change from Vec<&str> to Vec<String>, it works:
fn list_files (path: &str) -> Result<Vec<String>, &str> {
    let mut list = Vec::new();
    if let Ok(dir_list) = fs::read_dir(path) {
        for entry in dir_list {
            if let Ok(entry) = entry {
                if let Ok(file_type) = entry.file_type() {
                    if file_type.is_file() {
                        if let Ok(name) = entry.file_name().into_string() {
                            list.push(name)
                        }
                    }
                }
            }
        }
        Ok(list)
    } else {
        Err("nope")
    }
}
Looks like I should apply that to my first try, right?
fn list_files (path: &str) -> Result<Vec<String>, &str> {
    if let Ok(dir_list) = fs::read_dir(path) {
        Ok(dir_list.filter_map(|e| {
            match e {
                Ok(entry) => match entry.file_type() {
                    Ok(_) => Some(entry.file_name().into_string().ok()),
                    _ => None
                },
                _ => None
            }
        }).collect())
    } else {
        Err("nope")
    }
}
At least a bit shorter… but it fails to compile because a collection of type std::vec::Vec<std::string::String> cannot be built from an iterator over elements of type std::option::Option<std::string::String>.
It is hard to stay patient here. Why does .filter_map return Options instead of just using them to filter? Now I have to change the closing }).collect()) to }).map(|e| e.unwrap()).collect()), which iterates once more over the result set.
That can't be right!
You can rely heavily on the ? operator:
use std::fs;
use std::io::{Error, ErrorKind};

fn list_files(path: &str) -> Result<Vec<String>, Error> {
    let mut list = Vec::new();
    for entry in fs::read_dir(path)? {
        let entry = entry?;
        if entry.file_type()?.is_file() {
            list.push(entry.file_name().into_string().map_err(|_| {
                Error::new(ErrorKind::InvalidData, "Cannot convert file name")
            })?)
        }
    }
    Ok(list)
}
Do not forget that you can split your code into functions or implement your own traits to simplify the final code:
use std::fs;
use std::io::{Error, ErrorKind};

trait CustomGetFileName {
    fn get_file_name(self) -> Result<String, Error>;
}

impl CustomGetFileName for std::fs::DirEntry {
    fn get_file_name(self) -> Result<String, Error> {
        Ok(self.file_name().into_string().map_err(|_|
            Error::new(ErrorKind::InvalidData, "Cannot convert file name")
        )?)
    }
}

fn list_files(path: &str) -> Result<Vec<String>, Error> {
    let mut list = Vec::new();
    for entry in fs::read_dir(path)? {
        let entry = entry?;
        if entry.file_type()?.is_file() {
            list.push(entry.get_file_name()?)
        }
    }
    Ok(list)
}
An alternative answer with iterators:
use std::fs;
use std::error::Error;
use std::path::PathBuf;

fn list_files(path: &str) -> Result<Vec<PathBuf>, Box<dyn Error>> {
    let x = fs::read_dir(path)?
        .filter_map(|e| e.ok())
        // Keep only entries whose metadata is readable and describes a regular file.
        .filter(|e| e.metadata().map(|m| m.is_file()).unwrap_or(false))
        .map(|e| e.path())
        .collect();
    Ok(x)
}

fn main() {
    let path = ".";
    for res in list_files(path).unwrap() {
        println!("{:#?}", res);
    }
}
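On the filter_map confusion in the question: the closure passed to filter_map is expected to return a plain Option<T>. The adapter drops each None and unwraps each Some in the same pass, so no extra map(|e| e.unwrap()) step is needed; the compile error came from wrapping an Option in another Some. A sketch of that first attempt, using ? inside the closure to flatten the nested matches (it also checks is_file(), which the first attempt left out):

```rust
use std::fs;
use std::io;

fn list_files(path: &str) -> io::Result<Vec<String>> {
    let names = fs::read_dir(path)?
        .filter_map(|entry| {
            // In a closure returning Option, `?` means "skip this entry".
            let entry = entry.ok()?;
            let file_type = entry.file_type().ok()?;
            if !file_type.is_file() {
                return None;
            }
            // Already an Option<String>: no extra Some(...) around it.
            entry.file_name().into_string().ok()
        })
        .collect();
    Ok(names)
}

fn main() {
    println!("{:?}", list_files("testdir").unwrap_or_default());
}
```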