std::fs::canonicalize for files that don't exist - rust

I'm writing a program in Rust that creates a file at a user-defined path. I need to be able to normalize intermediate components (~/ should become $HOME/, ../ should go up a directory, etc.) in order to create the file in the right place. std::fs::canonicalize does almost exactly what I want, but it fails if the path does not already exist (it returns an Err, which panics when unwrapped).
Is there a function that normalizes components the same way as std::fs::canonicalize but doesn't fail if the file doesn't already exist?
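For reference, here is a minimal sketch of how std::fs::canonicalize behaves on a missing path: it reports an io::Error of kind NotFound rather than panicking by itself. The helper name canonicalize_if_exists is hypothetical and only illustrates the check.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Canonicalize `path`, reporting a missing file as `Ok(None)` instead of
/// an error. Other I/O errors (permissions, etc.) are passed through.
/// (Hypothetical helper, not part of the standard library.)
fn canonicalize_if_exists(path: &Path) -> io::Result<Option<PathBuf>> {
    match fs::canonicalize(path) {
        Ok(resolved) => Ok(Some(resolved)),
        // The path (or one of its parents) does not exist.
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

This only detects the missing file; it does not normalize anything in that case, which is why a separate lexical approach is still needed.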

There are good reasons such a function isn't standard:
There's no unique path when you're dealing with both links and non-existent files. If a/b is a link to c/d/e, then a/b/../f could mean either a/f or c/d/f.
The ~ shortcut is a shell feature. You may want to generalize it (I do), but that's a non-obvious choice, especially when you consider that ~ is a valid file name on most systems.
That said, such a function is sometimes useful, in cases where those ambiguities aren't a problem because of the nature of your application.
Here's what I do in such a case:
use {
    directories::UserDirs,
    lazy_regex::*,
    log::warn, // assumption: the `warn!` macro below comes from the `log` crate
    std::path::{Component, Path, PathBuf},
};
/// Build a usable path from a user input which may be absolute
/// (if it starts with / or ~) or relative to the supplied base_dir.
/// (We might want to try to detect Windows drives in the future, too.)
pub fn path_from<P: AsRef<Path>>(
    base_dir: P,
    input: &str,
) -> PathBuf {
    let tilde = regex!(r"^~(/|$)");
    if input.starts_with('/') {
        // if the input starts with a `/`, we use it as is
        input.into()
    } else if tilde.is_match(input) {
        // if the input starts with `~` as first token, we replace
        // this `~` with the user home directory
        PathBuf::from(
            &*tilde
                .replace(input, |c: &Captures| {
                    if let Some(user_dirs) = UserDirs::new() {
                        format!(
                            "{}{}",
                            user_dirs.home_dir().to_string_lossy(),
                            &c[1],
                        )
                    } else {
                        warn!("no user dirs found, no expansion of ~");
                        c[0].to_string()
                    }
                })
        )
    } else {
        // we put the input behind the source (the selected directory
        // or its parent) and we normalize so that the user can type
        // paths with `../`
        normalize_path(base_dir.as_ref().join(input))
    }
}
/// Improve the path by trying to remove and resolve `..` tokens.
///
/// This assumes that `a/b/../c` is `a/c`, which might be different from
/// what the OS would have chosen when b is a link. This is OK
/// for broot verb arguments but can't be generally used elsewhere.
///
/// This function ensures a given path ending with '/' still
/// ends with '/' after normalization.
pub fn normalize_path<P: AsRef<Path>>(path: P) -> PathBuf {
    let ends_with_slash = path.as_ref()
        .to_str()
        .map_or(false, |s| s.ends_with('/'));
    let mut normalized = PathBuf::new();
    for component in path.as_ref().components() {
        match &component {
            Component::ParentDir => {
                if !normalized.pop() {
                    normalized.push(component);
                }
            }
            _ => {
                normalized.push(component);
            }
        }
    }
    if ends_with_slash {
        normalized.push("");
    }
    normalized
}
(This uses the directories crate to get the home directory in a cross-platform way, but other crates exist, and you could also just read the $HOME environment variable on most platforms.)
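As a rough illustration of the $HOME alternative mentioned above, here is a std-only sketch of the tilde expansion. The function names expand_tilde_with and expand_tilde are hypothetical, and the sketch assumes a Unix-like platform where $HOME is set; it deliberately leaves `~user` and file names that merely start with `~` untouched.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Expand a leading `~` (alone or followed by `/`) using `home`.
/// Everything else is returned unchanged, and so is the input when no
/// home directory is known.
fn expand_tilde_with(input: &str, home: Option<&Path>) -> PathBuf {
    match (input.strip_prefix('~'), home) {
        // The input is exactly "~".
        (Some(""), Some(home)) => home.to_path_buf(),
        // The input is "~/something": append the remainder to the home.
        (Some(rest), Some(home)) if rest.starts_with('/') => {
            home.join(rest.trim_start_matches('/'))
        }
        _ => PathBuf::from(input),
    }
}

/// Convenience wrapper reading `$HOME` (an assumption: on Windows the
/// home directory is usually found elsewhere).
fn expand_tilde(input: &str) -> PathBuf {
    let home = env::var_os("HOME").map(PathBuf::from);
    expand_tilde_with(input, home.as_deref())
}
```

Passing the home directory as a parameter keeps the core logic testable without touching the environment; with a home of /home/alice, the input ~/notes.txt becomes /home/alice/notes.txt, while ~backup stays as it is.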

Related

How to iterate over Option<&Path> with for loop structure?

I was iterating over a vector of Path references. Now I'm using a crate that returns the Option<&Path> type instead. I've read that Option has an iterator, but is there a way to keep my for loop structure, such as:
let mut paths: Vec<&Option<&Path>> = vec![];
for i in paths {
...
}
I need to check whether the path in the current iteration exists, which I was doing with:
if std::path::Path::new(&i).exists()
Obviously std::path::Path doesn't satisfy std::option::Option.
Is there a way for me to maintain this structure and satisfy the Option type? Particularly, how would I change my if statement to satisfy it?
You just need, for each value, to check that it's an Option::Some.
Since Option is indeed iterable, you can just use this property:
for i in paths.into_iter().flatten() {
    ...
}
An Option is conceptually a collection of 0 or 1 elements; by flattening it you remove the Option::None instances and replace each Option::Some with whatever it contains (a &Path).
Incidentally, why are you creating a Path from a Path? You can just check i.exists(), since i is already a &Path.
for path in paths {
    if let Some(path) = path {
        if path.exists() {
            // ...
        }
    }
}
Or:
// `flatten` turns the iterator of `Option<&Path>` into an iterator of `&Path`, discarding the `None` values
for path in paths.iter().flatten() {
    if path.exists() {
        // ...
    }
}
Or:
for path in paths.iter().flatten().filter(|p| p.exists()) {
    // ...
}
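The variants above can be combined into one self-contained sketch. It assumes the vector holds Option<&Path> values directly (rather than references to options), and the function name existing_paths is hypothetical.

```rust
use std::path::Path;

/// Return only the paths that are present (`Some`) and exist on disk.
fn existing_paths<'a>(paths: &[Option<&'a Path>]) -> Vec<&'a Path> {
    paths
        .iter()
        .flatten() // skips every `None`, yields `&&Path` for every `Some`
        .copied() // `&&Path` -> `&Path`
        .filter(|p| p.exists())
        .collect()
}
```

The `copied` step is only there to return plain `&Path` values; inside a for loop you can usually rely on auto-deref and skip it.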

Is there a way to treat an absolute std::path::Path as a relative one when joining? [duplicate]

I think this should be quite doable, given that there is a nice function canonicalize which normalizes paths (so I can start by normalizing my two input paths) and Path and PathBuf give us a way of iterating over the parts of paths through components. I imagine something could be worked out here to factor out a common prefix, then prepend as many .. components as remain in the anchor path to what remains of the initial input path.
My problem seems to be pretty common:
How to find relative path given two absolute paths?
Find a path in Windows relative to another
This now exists as the pathdiff crate, using the code from kennytm's answer
You can use it as:
extern crate pathdiff;
pathdiff::diff_paths(path, base);
where base is the path that the returned relative path must be applied to in order to obtain path.
If one path is a base of another, you could use Path::strip_prefix, but it won't calculate the ../ for you (instead returns an Err):
use std::path::*;
let base = Path::new("/foo/bar");
let child_a = Path::new("/foo/bar/a");
let child_b = Path::new("/foo/bar/b");
println!("{:?}", child_a.strip_prefix(base)); // Ok("a")
println!("{:?}", child_a.strip_prefix(child_b)); // Err(StripPrefixError(()))
The previous incarnation of strip_prefix was path_relative_from which used to add ../, but this behavior was dropped due to symlinks:
The current behavior where joining the result onto the first path unambiguously refers to the same thing the second path does, even if there's symlinks (which basically means base needs to be a prefix of self)
The old behavior where the result can start with ../ components. Symlinks mean traversing the base path and then traversing the returned relative path may not put you in the same directory that traversing the self path does. But this operation is useful when either you're working with a path-based system that doesn't care about symlinks, or you've already resolved symlinks in the paths you're working with.
If you need the ../ behavior, you could copy the implementation from librustc_back (the compiler backend). I didn't find any packages on crates.io providing this yet.
// This routine is adapted from the *old* Path's `path_relative_from`
// function, which works differently from the new `relative_from` function.
// In particular, this handles the case on unix where both paths are
// absolute but with only the root as the common directory.
fn path_relative_from(path: &Path, base: &Path) -> Option<PathBuf> {
    use std::path::Component;
    if path.is_absolute() != base.is_absolute() {
        if path.is_absolute() {
            Some(PathBuf::from(path))
        } else {
            None
        }
    } else {
        let mut ita = path.components();
        let mut itb = base.components();
        let mut comps: Vec<Component> = vec![];
        loop {
            match (ita.next(), itb.next()) {
                (None, None) => break,
                (Some(a), None) => {
                    comps.push(a);
                    comps.extend(ita.by_ref());
                    break;
                }
                (None, _) => comps.push(Component::ParentDir),
                (Some(a), Some(b)) if comps.is_empty() && a == b => (),
                (Some(a), Some(b)) if b == Component::CurDir => comps.push(a),
                (Some(_), Some(b)) if b == Component::ParentDir => return None,
                (Some(a), Some(_)) => {
                    comps.push(Component::ParentDir);
                    for _ in itb {
                        comps.push(Component::ParentDir);
                    }
                    comps.push(a);
                    comps.extend(ita.by_ref());
                    break;
                }
            }
        }
        Some(comps.iter().map(|c| c.as_os_str()).collect())
    }
}
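The idea sketched in the question (factor out the common prefix, then prepend one `..` per remaining component of the anchor path) can also be written more compactly. This is a simplified, purely lexical sketch, not the routine above: the name relative_to is hypothetical, and it assumes Unix-style absolute paths that are already free of `.`, `..` and symlinks (for example because they came from canonicalize).

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically compute a path that leads from `base` to `target`.
/// Returns `None` unless both paths are absolute.
fn relative_to(target: &Path, base: &Path) -> Option<PathBuf> {
    if !target.is_absolute() || !base.is_absolute() {
        return None;
    }
    let target: Vec<Component> = target.components().collect();
    let base: Vec<Component> = base.components().collect();

    // Length of the shared leading run of components.
    let common = target
        .iter()
        .zip(base.iter())
        .take_while(|(a, b)| a == b)
        .count();

    let mut result = PathBuf::new();
    // One `..` for every component of `base` below the common prefix...
    for _ in &base[common..] {
        result.push(Component::ParentDir);
    }
    // ...then descend into what is left of `target`.
    for component in &target[common..] {
        result.push(component);
    }
    Some(result)
}
```

For example, going from /foo/baz to /foo/bar/a yields ../bar/a, and two identical paths yield an empty path.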

Why is clone required to be called explicitly with Strings in some cases but not others?

I was working in a coding dojo, trying to learn Rust. The attached link has all our code and tests. However, we got stumped as to why we had to call clone() in one function but not the other.
Why do I need to call game.clone() on line 23 of lib.rs in this link? https://cyber-dojo.org/kata/edit/WvEB5z
pub fn say_game_score(game: Game) -> String {
    if game.player1.score == game.player2.score {
        return say_equal_score(game.player1.score);
    }
    if can_be_won(game) { // This line required game.clone() WHY???
        return say_winning_situation(game); // This line does NOT require game.clone()
    }
    return format!(
        "{} {}",
        say_score_name(game.player1.score),
        say_score_name(game.player2.score)
    );
}
fn say_winning_situation(game: Game) -> String {
    if game.player1.score > game.player2.score {
        return say_leading_situation(game.player1.name, game.player1.score - game.player2.score);
    } else {
        return say_leading_situation(game.player2.name, game.player2.score - game.player1.score);
    }
}
fn can_be_won(game: Game) -> bool {
    return game.player1.score > FORTY || game.player2.score > FORTY;
}
can_be_won(game) moves the variable game into the function. When you then call say_winning_situation(game), the variable has already been moved and can't be used anymore. The Rust compiler checks this at compile time. The second call needs no clone because it is the last use of game on that code path: the function returns immediately afterwards.
The compiler suggests that you clone the game in the first invocation, so that a copy is moved and the original stays usable.
You probably want to use references instead of values in your functions. Only take ownership when you need it. For read-only access, a shared reference (which is immutable by default) is your first choice.
You should read about ownership and borrowing in Rust.
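A minimal sketch of the reference-based version follows. The Player and Game types, the value of FORTY and the output strings are assumptions made for illustration (the question only shows the field names), so treat this as one possible shape rather than the kata's actual code.

```rust
// Hypothetical reconstruction of the question's types.
const FORTY: u32 = 3;

struct Player {
    name: String,
    score: u32,
}

struct Game {
    player1: Player,
    player2: Player,
}

/// Only reads the scores, so a shared reference is enough.
fn can_be_won(game: &Game) -> bool {
    game.player1.score > FORTY || game.player2.score > FORTY
}

/// Also read-only: borrows the game and builds a new String.
fn say_winning_situation(game: &Game) -> String {
    let (leader, margin) = if game.player1.score > game.player2.score {
        (&game.player1, game.player1.score - game.player2.score)
    } else {
        (&game.player2, game.player2.score - game.player1.score)
    };
    format!("{} leads by {}", leader.name, margin)
}

fn say_game_score(game: &Game) -> String {
    if can_be_won(game) {
        // `game` was only borrowed above, so it is still usable here.
        return say_winning_situation(game);
    }
    format!("{} {}", game.player1.score, game.player2.score)
}
```

Because every function takes &Game, the caller keeps ownership and no clone() is needed anywhere.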

Idiomatic rust way to properly parse Clap ArgMatches

I'm learning Rust and trying to make a find-like utility (yes, another one). I'm using clap and trying to support both the command line and a config file for the program's parameters (this has nothing to do with the clap yml file).
I'm trying to parse the arguments, and if none were passed to the app, I will try to load them from a config file.
I don't know how to do this in an idiomatic way.
fn main() {
    let matches = App::new("findx")
        .version(crate_version!())
        .author(crate_authors!())
        .about("find + directory operations utility")
        .arg(
            Arg::with_name("paths")
            ...
        )
        .arg(
            Arg::with_name("patterns")
            ...
        )
        .arg(
            Arg::with_name("operation")
            ...
        )
        .get_matches();
    let paths;
    let patterns;
    let operation;
    if matches.is_present("patterns") && matches.is_present("operation") {
        patterns = matches.values_of("patterns").unwrap().collect();
        paths = matches.values_of("paths").unwrap_or(clap::Values<&str>{"./"}).collect(); // this doesn't work
        operation = match matches.value_of("operation").unwrap() { // I don't like this
            "Append" => Operation::Append,
            "Prepend" => Operation::Prepend,
            "Rename" => Operation::Rename,
            _ => {
                print!("Operation unsupported");
                process::exit(1);
            }
        };
    } else if Path::new("findx.yml").is_file() {
        //TODO: try load from config file
    } else {
        eprintln!("Command line parameters or findx.yml file must be provided");
        process::exit(1);
    }
    if let Err(e) = findx::run(Config {
        paths: paths,
        patterns: patterns,
        operation: operation,
    }) {
        eprintln!("Application error: {}", e);
        process::exit(1);
    }
}
Is there an idiomatic way to extract the values of Option and Result types into the enclosing scope? All the examples I have read use match or if let Some(x) to consume the x value inside the scope of the pattern match, but I need to assign the value to a variable.
Can someone help me with this, or point me in the right direction?
Best regards
Personally, I see nothing wrong with using the match statement and either folding it or moving it into another function. But if you want to remove it, there are several options.
You can use the .default_value_if() method, which is implemented for clap::Arg, to give an argument a default value that depends on the value of another argument.
From the clap documentation:
//sets value of arg "other" to "default" if value of "--opt" is "special"
let m = App::new("prog")
.arg(Arg::with_name("opt")
.takes_value(true)
.long("opt"))
.arg(Arg::with_name("other")
.long("other")
.default_value_if("opt", Some("special"), "default"))
.get_matches_from(vec![
"prog", "--opt", "special"
]);
assert_eq!(m.value_of("other"), Some("default"));
In addition you can add a validator to your operation OR convert your valid operation values into flags.
Here's an example converting your match arm values into individual flags (smaller example for clarity).
extern crate clap;
use clap::{Arg, App};
fn command_line_interface<'a>() -> clap::ArgMatches<'a> {
    // Sets the command line interface of the program.
    App::new("something")
        .version("0.1")
        .arg(Arg::with_name("rename")
            .help("renames something")
            .short("r")
            .long("rename"))
        .arg(Arg::with_name("prepend")
            .help("prepends something")
            .short("p")
            .long("prepend"))
        .arg(Arg::with_name("append")
            .help("appends something")
            .short("a")
            .long("append"))
        .get_matches()
}
#[derive(Debug)]
enum Operation {
    Rename,
    Append,
    Prepend,
}
fn main() {
    let matches = command_line_interface();
    let operation = if matches.is_present("rename") {
        Operation::Rename
    } else if matches.is_present("prepend") {
        Operation::Prepend
    } else {
        // DEFAULT
        Operation::Append
    };
    println!("Value of operation is {:?}", operation);
}
I hope this helps!
EDIT:
You can also use subcommands for your specific operations. It all depends on what you want the interface to be like.
let app_m = App::new("git")
    .subcommand(SubCommand::with_name("clone"))
    .subcommand(SubCommand::with_name("push"))
    .subcommand(SubCommand::with_name("commit"))
    .get_matches();
match app_m.subcommand() {
    ("clone", Some(sub_m)) => {}, // clone was used
    ("push", Some(sub_m)) => {}, // push was used
    ("commit", Some(sub_m)) => {}, // commit was used
    _ => {}, // Either no subcommand or one not tested for...
}
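On the original question of getting Option and Result values into variables in the enclosing scope, here is a std-only sketch of the "place the match in another function" idea. The names resolve, raw_paths and raw_operation are hypothetical: they stand in for what values_of("paths") and value_of("operation") would hand back, so no clap call is involved here.

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
enum Operation {
    Append,
    Prepend,
    Rename,
}

// Moving the match into `FromStr` lets callers write `value.parse()`.
impl FromStr for Operation {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "Append" => Ok(Operation::Append),
            "Prepend" => Ok(Operation::Prepend),
            "Rename" => Ok(Operation::Rename),
            other => Err(format!("Operation unsupported: {}", other)),
        }
    }
}

/// Turn the raw, optional inputs into plain values or an error message.
fn resolve<'a>(
    raw_paths: Option<Vec<&'a str>>,
    raw_operation: Option<&str>,
) -> Result<(Vec<&'a str>, Operation), String> {
    // Fall back to the current directory when no path was given.
    let paths = raw_paths.unwrap_or_else(|| vec!["./"]);
    // `ok_or` turns the Option into a Result so `?` can propagate the error.
    let operation = raw_operation
        .ok_or("missing operation")?
        .parse::<Operation>()?;
    Ok((paths, operation))
}
```

The caller then handles a single Result in main (printing the message and exiting on Err), and paths and operation are ordinary variables bound in one let.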
