How do I change "C:\foo\bar.txt" to "C:\foo\baz\bar.txt" using either Path or PathBuf?
I want to add a folder to the path immediately before the filename.
The Path type supports a number of methods to manipulate and destructure paths, so it should be straightforward to append a directory. For example:
use std::path::{Path, PathBuf};
fn append_dir(p: &Path, d: &str) -> PathBuf {
let dirs = p.parent().unwrap();
dirs.join(d).join(p.file_name().unwrap())
}
I'm testing it on Linux, so for me the test looks like this, but on Windows you should be able to use C:\... just fine:
fn main() {
let p = Path::new(r"/foo/bar.txt");
assert_eq!(append_dir(&p, "baz"), Path::new(r"/foo/baz/bar.txt"));
}
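The version above panics on inputs that have no parent or no file name (for example `/` or a path ending in `..`). A minimal sketch of a non-panicking variant follows; the name `try_append_dir` is made up for illustration and returns `None` in those cases:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical variant of `append_dir` that returns `None` instead of
/// panicking when the path has no parent or no final file name.
fn try_append_dir(p: &Path, d: &str) -> Option<PathBuf> {
    let parent = p.parent()?;       // None for a root such as "/"
    let file_name = p.file_name()?; // None when the path ends in ".."
    Some(parent.join(d).join(file_name))
}
```

Calling `try_append_dir(Path::new("/foo/bar.txt"), "baz")` should give the same path as `append_dir`, wrapped in `Some`.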
I want to be able to parse all the files in a directory to find the one with the greatest timestamp that matches a user provided pattern.
I.e. if the user runs
$ search /foo/bar/baz.txt
and the directory /foo/bar/ contains files baz.001.txt, baz.002.txt, and baz.003.txt, then the result should be baz.003.txt
At the moment I'm constructing a PathBuf, using that to build a Regex, and then finding all the files in the directory that match the expression.
But it feels like this is a lot of work for a relatively simple problem.
fn find(foo: &str) -> Result<Vec<String>, Box<dyn Error>> {
let mut files = vec![];
let path = PathBuf::from(foo);
let base = path.parent().unwrap().to_str().unwrap();
let file_name = path.file_stem().unwrap().to_str().unwrap();
let extension = path.extension().unwrap().to_str().unwrap();
let pattern = format!("{}\\.\\d{{3}}\\.{}", file_name, extension);
let expression = Regex::new(&pattern).unwrap();
let objects: Vec<String> = fs::read_dir(&base)
.unwrap()
.map(|entry| {
entry
.unwrap()
.path()
.file_name()
.unwrap()
.to_str()
.unwrap()
.to_owned()
})
.collect();
for object in objects.iter() {
if expression.is_match(object) {
files.push(String::from(object));
}
}
Ok(files)
}
Is there an easier way to take the file path, generate a pattern, and find all the matching files?
Rust is not really a language suited to quick and dirty solutions. Instead, it strongly encourages elegant solutions in which all corner cases are properly handled. This usually does not lead to extremely short code, but you can avoid a lot of boilerplate by relying on external crates that factor much of it out. Here is what I would do, assuming you don't already have a "library-wide" error type.
fn find(foo: &str) -> Result<Vec<String>, FindError> {
let path = PathBuf::from(foo);
let base = path
.parent()
.ok_or(FindError::InvalidBaseFile)?
.to_str()
.ok_or(FindError::OsStringNotUtf8)?;
let file_name = path
.file_stem()
.ok_or(FindError::InvalidFileName)?
.to_str()
.ok_or(FindError::OsStringNotUtf8)?;
let file_extension = path
.extension()
.ok_or(FindError::NoFileExtension)?
.to_str()
.ok_or(FindError::OsStringNotUtf8)?;
let pattern = format!(r"{}\.\d{{3}}\.{}", file_name, file_extension);
let expression = Regex::new(&pattern)?;
Ok(
fs::read_dir(&base)?
.map(|entry| Ok(
entry?
.path()
.file_name()
.ok_or(FindError::InvalidFileName)?
.to_str()
.ok_or(FindError::OsStringNotUtf8)?
.to_string()
))
.collect::<Result<Vec<_>, FindError>>()?
.into_iter()
.filter(|file_name| expression.is_match(&file_name))
.collect()
)
}
A simplistic definition of FindError could be achieved via the thiserror crate:
use thiserror::Error;
#[derive(Error, Debug)]
enum FindError {
#[error(transparent)]
RegexError(#[from] regex::Error),
#[error("File name has no extension")]
NoFileExtension,
#[error("Not a valid file name")]
InvalidFileName,
#[error("No valid base file")]
InvalidBaseFile,
#[error("An OS string is not valid utf-8")]
OsStringNotUtf8,
#[error(transparent)]
IoError(#[from] std::io::Error),
}
Edit
As pointed out by @Masklinn, you can retrieve the stem and the extension of the file without all that hassle. The errors are handled less precisely (and some corner cases, such as a hidden file without an extension, are handled poorly), but the code is less verbose overall. It is up to you to choose depending on your needs.
fn find(foo: &str) -> Result<Vec<String>, FindError> {
let (file_name, file_extension) = foo
.rsplit_once('.')
.ok_or(FindError::NoExtension)?;
... // the rest is unchanged
}
You probably need to adapt FindError too. You can also get rid of the ok_or case, and just replace it with a .unwrap_or((foo, "")) if you don't really care about it (however this will give surprising results...).
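If the regex dependency feels heavy for this, the same `<stem>.<3 digits>.<ext>` check can be sketched with plain string methods from the standard library. This is only an illustration under that naming assumption; `numbered_suffix` and `latest` are hypothetical helpers, not part of the code above. Unlike an unanchored `is_match`, it only accepts names that match in full, and it also picks the highest number, which is what the question ultimately asks for:

```rust
/// Hypothetical helper: returns the number if `name` looks exactly like
/// `<stem>.<3 digits>.<ext>`, and `None` otherwise.
fn numbered_suffix(name: &str, stem: &str, ext: &str) -> Option<u32> {
    // After removing the stem and the extension, ".NNN." should remain.
    let middle = name.strip_prefix(stem)?.strip_suffix(ext)?;
    let digits = middle.strip_prefix('.')?.strip_suffix('.')?;
    if digits.len() == 3 && digits.bytes().all(|b| b.is_ascii_digit()) {
        digits.parse().ok()
    } else {
        None
    }
}

/// Hypothetical helper: picks the matching name with the greatest number.
fn latest<'a>(names: &[&'a str], stem: &str, ext: &str) -> Option<&'a str> {
    names
        .iter()
        .copied()
        .filter_map(|name| numbered_suffix(name, stem, ext).map(|n| (n, name)))
        .max_by_key(|(n, _)| *n)
        .map(|(_, name)| name)
}
```

The names passed to `latest` would come from the `fs::read_dir` loop, exactly as in the functions above.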
I can't read a path on my network using this code.
Maybe fs::read_dir works only for local directories.
use std::fs;
fn main() {
let paths = fs::read_dir("\\nbsvr01").unwrap();
for path in paths {
println!("Name: {}", path.unwrap().path().display())
}
}
As @rodrigo said, you can use r"\\nbsvr01", which is more straightforward and easier to understand.
If you are not familiar with raw string literals, check this out :)
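To make the difference concrete, here is a small sketch comparing the three spellings. The constant names are made up for illustration; the host name is the one from the question:

```rust
// In a normal string literal, `\\` is an escape for ONE backslash,
// so this is the 8-character string \nbsvr01 (not a UNC prefix).
const ESCAPED: &str = "\\nbsvr01";

// A raw string literal performs no escape processing,
// so this is the 9-character string \\nbsvr01.
const RAW: &str = r"\\nbsvr01";

// The same value written without a raw literal needs four backslashes.
const DOUBLED: &str = "\\\\nbsvr01";
```

`RAW` and `DOUBLED` are equal, while `ESCAPED` is one backslash short, which is why the original call was not looking at the network path.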
I have a requirement to pass to a procedural macro either a text file or the contents of a text file, such that the procedural macro acts based on the contents of that text file at compile time. That is, the text file configures the output of the macro. The use case for this is the file defining a register map which the macro builds into a library.
The second requirement is that the text file is properly handled by Cargo, such that changes to the text file trigger a recompile in the same way as changes to the source file trigger a recompile.
My initial thought was to create a static string using the include_str! macro. This solves the second requirement but I can't see how to pass that to the macro - at that point I only have the identifier of the string to pass in:
use my_macro_lib::my_macro;
static MYSTRING: &'static str = include_str!("myfile");
my_macro!(MYSTRING); // Not the string itself!
I can pass a string to the macro with the name of the file in a string literal, and open the file inside the macro:
my_macro!("myfile");
At which point I have two problems:
It's not obvious how to get the path of the calling function in order to get the path of the file. I initially thought this would be exposed through the token Span, but it seems in general not (perhaps I'm missing something?).
It's not obvious how to make the file make Cargo trigger a recompile on changes. One idea I had to force this was to add an include_str!("myfile") to the output of the macro, which would hopefully result in the compile being made aware of "myfile", but this is a bit mucky.
Is there some way to do what I'm trying to do? Perhaps either by somehow getting the contents of the string inside the macro that was created outside, or reliably getting the path of the calling rust file (then making Cargo treat changes properly).
As an aside, I've read various places that tell me I can't get access to the contents of variables inside the macro, but it seems to me that this is exactly what the quote macro is doing with #variables. How is this working?
So it turns out this is possible in essentially the way I was hoping with the stable compiler.
If we accept that we need to work relative to the crate root, we can define our paths as such.
Helpfully, inside the macro code, std::env::current_dir() will return the current working directory as the root of the crate containing the call site. This means, even if the macro invocation is inside some crate hierarchy, it will still return a path that is meaningful at the location of the macro invocation.
The following example macro does essentially what I need. For brevity, it's not designed to handle errors properly:
extern crate proc_macro;
use quote::quote;
use proc_macro::TokenStream;
use syn::parse::{Parse, ParseStream, Result};
use syn;
use std;
use std::fs::File;
use std::io::Read;
#[derive(Debug)]
struct FileName {
filename: String,
}
impl Parse for FileName {
fn parse(input: ParseStream) -> Result<Self> {
let lit_file: syn::LitStr = input.parse()?;
Ok(Self { filename: lit_file.value() })
}
}
#[proc_macro]
pub fn my_macro(input: TokenStream) -> TokenStream {
let input = syn::parse_macro_input!(input as FileName);
let cwd = std::env::current_dir().unwrap();
let file_path = cwd.join(&input.filename);
let file_path_str = format!("{}", file_path.display());
println!("path: {}", file_path.display());
let mut file = File::open(file_path).unwrap();
let mut contents = String::new();
file.read_to_string(&mut contents).unwrap();
println!("contents: {:?}", contents);
let result = quote!(
const FILE_STR: &'static str = include_str!(#file_path_str);
pub fn foo() -> bool {
println!("Hello");
true
}
);
TokenStream::from(result)
}
Which can be invoked with
my_macro!("mydir/myfile");
where mydir is a directory in the root of the invoking crate.
This uses the hack of using an include_str!() in the macro output to cause rebuilds on changes to myfile. This is necessary and does what is expected. I would expect this to be optimised out if it's never actually used.
I'd be interested to know if this approach falls over in any situation.
Relevant to my original question, current nightly implements the source_file() method on Span. This might be a better way to implement the above, but I'd rather stick with stable. The tracking issue for this is here.
Edit:
The above implementation fails when the package is in a workspace, at which point the current working directory is the workspace root, not the crate root. This is easy to work around with something like the following (inserted between the cwd and file_path declarations).
let cwd = std::env::current_dir().unwrap();
let cargo_path = cwd.join("Cargo.toml");
let mut cargo_file = File::open(cargo_path).unwrap();
let mut cargo_contents = String::new();
cargo_file.read_to_string(&mut cargo_contents).unwrap();
// Use a simple regex to detect the suitable tag in the toml file. Much
// simpler than using the toml crate and probably good enough according to
// the workspace RFC.
let cargo_re = regex::Regex::new(r"(?m)^\[workspace\][ \t]*$").unwrap();
let workspace_path = match cargo_re.find(&cargo_contents) {
// Both arms must yield a String, so unwrap the Result from env::var.
Some(_) => std::env::var("CARGO_PKG_NAME").unwrap(),
None => String::new(),
};
let file_path = cwd.join(workspace_path).join(input.filename);
let file_path_str = format!("{}", file_path.display());
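A possibly simpler alternative to parsing Cargo.toml: Cargo sets the CARGO_MANIFEST_DIR environment variable to the directory containing the manifest of the crate being compiled, and as far as I know a procedural macro can read it at expansion time. The sketch below assumes that; `resolve_in_crate` is a hypothetical helper name, and the fallback to the working directory is my own addition:

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical helper: resolve `relative` against the directory that holds
/// the Cargo.toml of the crate being compiled. Falls back to the current
/// working directory when CARGO_MANIFEST_DIR is not set.
fn resolve_in_crate(relative: &str) -> PathBuf {
    let base = env::var_os("CARGO_MANIFEST_DIR")
        .map(PathBuf::from)
        .or_else(|| env::current_dir().ok())
        .unwrap_or_default();
    base.join(relative)
}
```

Inside the macro, `resolve_in_crate(&input.filename)` would then take the place of the `cwd.join(...)` logic, in or out of a workspace.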
Using the following program:
use std::path::Path;
fn main() {
println!("{:?}", Path::new("P:").join("A_B_C\\D\\E\\F\\G.hij"));
}
POSIX will give you:
"P:/A_B_C\\D\\E\\F\\G.hij"
But Windows will give you:
"P:A_B_C\\D\\E\\F\\G.hij"
The latter isn't considered to be the intended path, at least by std::fs::copy.
For the sake of an example:
fn my_function(p: &Path) -> PathBuf {
p.join("Temp")
}
Firstly, note that when you specify a drive letter without a trailing slash, the Windows API interprets it as a path relative to the current directory on that drive. I.e. P: and P:\ can refer to different locations, and P:file.txt is a valid path that means P:\current\dir\file.txt. You can verify this by changing directory on that drive and then running dir P: and dir P:\ from the command prompt.
If you are sure that you want to interpret "P:" as a root path, then you should probably detect it manually and add the root slash, but I believe this is bad practice.
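For completeness, the manual detection could look roughly like the sketch below. It works on the string form so that it behaves the same on every platform; `drive_root_or_same` is a hypothetical name, and it deliberately handles only the bare "X:" case:

```rust
/// Hypothetical helper: if `p` is exactly a drive letter followed by a colon
/// (for example "P:"), append a backslash so that it names the drive root.
/// Every other input is returned unchanged.
fn drive_root_or_same(p: &str) -> String {
    let bytes = p.as_bytes();
    if bytes.len() == 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':' {
        format!("{}\\", p)
    } else {
        p.to_string()
    }
}
```

The result can then be passed to `Path::new(...).join(...)` as before. Note that this silently changes the meaning of a drive-relative path, which is the reason for the caveat above.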
To interpret the path prefix strictly and build an absolute path, you could use the Path::canonicalize() method, but please keep in mind that it only works for a drive/path that actually exists on the target OS.
use std::path::{Path, PathBuf};
fn canonical_join<P: AsRef<Path>>(a: P, b: P) -> PathBuf {
let a = a.as_ref();
a.canonicalize()
.unwrap_or_else(|_| PathBuf::from(a))
.join(b)
}
fn main() {
println!("{}", canonical_join("C:", "dir\\file.ext").display());
println!("{}", canonical_join("C:\\", "dir\\file.ext").display());
println!("{}", canonical_join("C:/", "dir\\file.ext").display());
}
In Python I can:
from distutils import spawn
cmd = spawn.find_executable("commandname")
I tried something like the code below, but it assumes you're on a Unix-like system with /usr/bin/which available (it also involves executing an external command, which I want to avoid):
use std::process::Command;
let output = Command::new("which")
.arg("commandname")
.output()
.unwrap_or_else(|e| panic!("handle error here: {}", e));
What is the simplest way to do this in Rust?
I found a crate that solves the problem: which. It includes Windows support, even accounting for PATHEXT.
I'd probably grab the environment variable and iterate through it, returning the first matching path:
use std::env;
use std::path::{Path, PathBuf};
fn find_it<P>(exe_name: P) -> Option<PathBuf>
where P: AsRef<Path>,
{
env::var_os("PATH").and_then(|paths| {
env::split_paths(&paths).filter_map(|dir| {
let full_path = dir.join(&exe_name);
if full_path.is_file() {
Some(full_path)
} else {
None
}
}).next()
})
}
fn main() {
println!("{:?}", find_it("cat"));
println!("{:?}", find_it("dog"));
}
This is probably ugly on Windows as you'd have to append the .exe to the executable name. It should also potentially be extended to only return items that are executable, which is again platform-specific code.
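As a rough sketch of the "only executable" check: on Unix one option is to look at the permission bits. The helper name `is_executable` is made up here, and the non-Unix branch is only a placeholder that falls back to `is_file()`:

```rust
use std::path::Path;

/// Hypothetical helper: true if `path` is a regular file with at least one
/// execute bit set (owner, group, or other).
#[cfg(unix)]
fn is_executable(path: &Path) -> bool {
    use std::os::unix::fs::PermissionsExt;
    path.metadata()
        .map(|m| m.is_file() && m.permissions().mode() & 0o111 != 0)
        .unwrap_or(false)
}

/// Placeholder for other platforms: only checks that the file exists.
#[cfg(not(unix))]
fn is_executable(path: &Path) -> bool {
    path.is_file()
}
```

In `find_it`, `full_path.is_file()` could then be swapped for `is_executable(&full_path)`. This does not check whether the current user is the one allowed to execute the file.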
Reviewing the Python implementation, it appears they also support an absolute path being passed. That's up to you if the function should support that or not.
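If you do want that behaviour, one possible sketch is to skip the PATH search whenever the argument has more than one path component. The name `find_it_or_path` and the component-count rule are my own assumptions, not something taken from the Python implementation:

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Hypothetical variant of `find_it`: a name containing a directory part
/// (such as "/bin/cat" or "./tool") is checked directly; a bare name is
/// searched for in each directory listed in PATH.
fn find_it_or_path(exe_name: &Path) -> Option<PathBuf> {
    if exe_name.components().count() > 1 {
        return if exe_name.is_file() {
            Some(exe_name.to_path_buf())
        } else {
            None
        };
    }
    env::var_os("PATH").and_then(|paths| {
        env::split_paths(&paths)
            .map(|dir| dir.join(exe_name))
            .find(|candidate| candidate.is_file())
    })
}
```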
A quick search on crates.io returned one crate that may be useful: quale, although it currently says
currently only works on Unix-like operating systems.
It wouldn't surprise me to find out there are others.
Here's some ugly code that adds .exe to the end if it's missing, but only on Windows.
use std::borrow::Cow;
#[cfg(not(target_os = "windows"))]
fn enhance_exe_name(exe_name: &Path) -> Cow<Path> {
exe_name.into()
}
#[cfg(target_os = "windows")]
fn enhance_exe_name(exe_name: &Path) -> Cow<Path> {
use std::ffi::OsStr;
use std::os::windows::ffi::OsStrExt;
let raw_input: Vec<_> = exe_name.as_os_str().encode_wide().collect();
let raw_extension: Vec<_> = OsStr::new(".exe").encode_wide().collect();
if raw_input.ends_with(&raw_extension) {
exe_name.into()
} else {
let mut with_exe = exe_name.as_os_str().to_owned();
with_exe.push(".exe");
PathBuf::from(with_exe).into()
}
}
// At the top of the `find_it` function:
// let exe_name = enhance_exe_name(exe_name.as_ref());