Find the relative path from two absolute paths [duplicate] - rust

I think this should be quite doable, given that there is a nice function canonicalize which normalizes paths (so I can start by normalizing my two input paths) and Path and PathBuf give us a way of iterating over the parts of paths through components. I imagine something could be worked out here to factor out a common prefix, then prepend as many .. components as remain in the anchor path to what remains of the initial input path.
My problem seems to be pretty common:
How to find relative path given two absolute paths?
Find a path in Windows relative to another

This now exists as the pathdiff crate, using the code from kennytm's answer
You can use it as:
extern crate pathdiff;
pathdiff::diff_paths(path, base);
where base is the directory from which the returned relative path, once applied, leads to path. diff_paths returns an Option<PathBuf>.

If one path is a base of another, you could use Path::strip_prefix, but it won't calculate the ../ for you (instead returns an Err):
use std::path::*;
let base = Path::new("/foo/bar");
let child_a = Path::new("/foo/bar/a");
let child_b = Path::new("/foo/bar/b");
println!("{:?}", child_a.strip_prefix(base)); // Ok("a")
println!("{:?}", child_a.strip_prefix(child_b)); // Err(StripPrefixError(()))
The previous incarnation of strip_prefix was path_relative_from which used to add ../, but this behavior was dropped due to symlinks:
The current behavior where joining the result onto the first path unambiguously refers to the same thing the second path does, even if there's symlinks (which basically means base needs to be a prefix of self)
The old behavior where the result can start with ../ components. Symlinks mean traversing the base path and then traversing the returned relative path may not put you in the same directory that traversing the self path does. But this operation is useful when either you're working with a path-based system that doesn't care about symlinks, or you've already resolved symlinks in the paths you're working with.
If you need the ../ behavior, you could copy the implementation from librustc_back (the compiler backend). I didn't find any packages on crates.io providing this yet.
use std::path::{Path, PathBuf};
// This routine is adapted from the *old* Path's `path_relative_from`
// function, which works differently from the new `relative_from` function.
// In particular, this handles the case on unix where both paths are
// absolute but with only the root as the common directory.
fn path_relative_from(path: &Path, base: &Path) -> Option<PathBuf> {
    use std::path::Component;
    if path.is_absolute() != base.is_absolute() {
        if path.is_absolute() {
            Some(PathBuf::from(path))
        } else {
            None
        }
    } else {
        let mut ita = path.components();
        let mut itb = base.components();
        let mut comps: Vec<Component> = vec![];
        loop {
            match (ita.next(), itb.next()) {
                (None, None) => break,
                (Some(a), None) => {
                    comps.push(a);
                    comps.extend(ita.by_ref());
                    break;
                }
                (None, _) => comps.push(Component::ParentDir),
                (Some(a), Some(b)) if comps.is_empty() && a == b => (),
                (Some(a), Some(b)) if b == Component::CurDir => comps.push(a),
                (Some(_), Some(b)) if b == Component::ParentDir => return None,
                (Some(a), Some(_)) => {
                    comps.push(Component::ParentDir);
                    for _ in itb {
                        comps.push(Component::ParentDir);
                    }
                    comps.push(a);
                    comps.extend(ita.by_ref());
                    break;
                }
            }
        }
        Some(comps.iter().map(|c| c.as_os_str()).collect())
    }
}
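The idea sketched in the question (factor out the common prefix, then prepend one .. per remaining component of the base) can be written quite compactly when both inputs are known to be absolute and already canonicalized. The following is only a minimal sketch under that assumption; relative_to is a hypothetical name, not part of std or pathdiff, and it does not handle Windows prefixes or unresolved .. components the way path_relative_from above does.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: relative path leading from `base` to `path`.
/// Assumes both are absolute and canonicalized (no `.`/`..`, symlinks resolved).
fn relative_to(path: &Path, base: &Path) -> PathBuf {
    let path_parts: Vec<Component> = path.components().collect();
    let base_parts: Vec<Component> = base.components().collect();

    // Number of leading components the two paths share.
    let common = path_parts
        .iter()
        .zip(base_parts.iter())
        .take_while(|(a, b)| a == b)
        .count();

    let mut result = PathBuf::new();
    // One `..` for every component of `base` beyond the common prefix.
    for _ in common..base_parts.len() {
        result.push(Component::ParentDir);
    }
    // Followed by whatever remains of `path`.
    for part in &path_parts[common..] {
        result.push(part);
    }
    result
}
```

For example, relative_to(Path::new("/foo/bar/a"), Path::new("/foo/baz/qux")) yields ../../bar/a. When the two paths are equal, the result is an empty path.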

Related

How to iterate over Option<&Path> with for loop structure?

I was iterating over a vector of Path references. Now I'm using a crate that instead returns the Option<&Path> type. I've read that Option has an iterator. But is there a way to keep my for loop structure, such as:
let mut paths: Vec<&Option<&Path>> = vec![];
for i in paths {
    ...
}
Because I need to check if the current iterating path value exists. And I was doing that with:
if std::path::Path::new(&i).exists()
Obviously std::path::Path doesn't satisfy std::option::Option.
Is there a way for me to maintain this structure and satisfy the Option type? Particularly, how would I change my if statement to satisfy it?
You just need, for each value, to check that it's an Option::Some.
Since Option is indeed iterable, you can just use this property:
for i in paths.into_iter().flatten() {
    ...
}
An Option is conceptually a collection of 0 or 1 elements. By flattening it you remove the Option::None instances and replace each Option::Some with whatever it contains (a &Path).
Incidentally, why are you creating a Path from a Path? You can just check if i.exists() since i is a &Path already.
for path in paths {
    if let Some(path) = path {
        if path.exists() {
            // ...
        }
    }
}
Or:
// `flatten` will turn the iterator of `Option<&Path>` into an iterator of `&Path`, discarding the `None` values
for path in paths.iter().flatten() {
    if path.exists() {
        // ...
    }
}
Or:
for path in paths.iter().flatten().filter(|p| p.exists()) {
    // ...
}
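Put together as a self-contained function, the flatten-and-filter approach might look like the sketch below. It assumes the options are held by value in a slice (rather than behind the extra reference used in the question), and existing_paths is a hypothetical name chosen for illustration.

```rust
use std::path::Path;

/// Hypothetical helper: keep only the paths that are present (`Some`)
/// and that exist on disk.
fn existing_paths<'a>(paths: &[Option<&'a Path>]) -> Vec<&'a Path> {
    paths
        .iter()
        .copied() // `Option<&Path>` is `Copy`, so this yields `Option<&Path>` by value
        .flatten() // drops `None`, unwraps `Some`
        .filter(|p| p.exists())
        .collect()
}
```

Collecting into a Vec is only for demonstration; inside a for loop the same iterator chain can be used directly.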

Cannot get Hash::get_mut() and File::open() to agree about mutability

During a lengthy computation, I need to look up some data in a number of different files. I cannot know beforehand how many or which files exactly, but chances are high that each file is used many times (on the order of 100 million times).
In the first version, I opened the file (whose name is an intermediate result of the computation) each time for lookup.
In the second version, I have a HashMap<String, Box<File>> where I remember already open files and open new ones lazily on demand.
I couldn't manage to handle the mutable stuff that arises from the need to have Files to be mutable. I got something working, but it looks overly silly:
let path = format!("egtb/{}.egtb", self.signature());
let hentry = hash.get_mut(&self.signature());
let mut file = match hentry {
    Some(f) => f,
    None => {
        let rfile = File::open(&path);
        let wtf = Box::new(match rfile {
            Err(ioe) => return Err(format!("could not open EGTB file {} ({})", path, ioe)),
            Ok(opened) => opened,
        });
        hash.insert(self.signature(), wtf);
        // the following won't work
        // wtf
        // &wtf
        // &mut wtf
        // So I came up with the following, but it doesn't feel right, does it?
        hash.get_mut(&self.signature()).unwrap()
    }
};
Is there a canonical way to get a mut File from File::open() or File::create()? In the manuals, this is always done with:
let mut file = File::open("foo.txt")?;
This means my function would have to return Result<_, io::Error> and I can't have that.
The problem seems to be that with the hash-lookup Some(f) gives me a &mut File but the Ok(f) from File::open gives me just a File, and I don't know how to make a mutable reference from that, so that the match arm's types match. I have no clear idea why the version as above at least compiles, but I'd very much like to learn how to do that without getting the File from the HashMap again.
Attempts to use wtf after it has been inserted into the hashmap fail to compile because the value was moved into the hashmap. Instead, you need to obtain the reference into the value stored in the hashmap. To do so without a second lookup, you can use the entry API:
use std::collections::hash_map::Entry;
let path = format!("egtb/{}.egtb", self.signature());
let mut file = match hash.entry(self.signature()) {
    Entry::Occupied(e) => e.into_mut(),
    Entry::Vacant(e) => {
        // bind the I/O error so it can be included in the message
        let rfile = File::open(&path)
            .map_err(|ioe| format!("could not open EGTB file {} ({})", path, ioe))?;
        e.insert(Box::new(rfile))
    }
};
// `file` is a mutable reference to the (boxed) `File` stored in the map, use it as needed
Note that map_err() allows you to use ? even when your function returns a Result not immediately compatible with the one you have.
Also note that there is no reason to box the File, a HashMap<String, File> would work just as nicely.
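As a rough, self-contained illustration of both notes, here is the same entry-based lookup with an unboxed HashMap<String, File> and a String error type. The function name cached_file and the use of the file path itself as the map key are assumptions made for the sketch, not part of the original code.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::fs::File;

/// Hypothetical helper: return the cached handle for `path`,
/// opening the file lazily on first use.
fn cached_file<'a>(
    cache: &'a mut HashMap<String, File>,
    path: &str,
) -> Result<&'a mut File, String> {
    match cache.entry(path.to_string()) {
        // Already open: hand back the stored handle.
        Entry::Occupied(e) => Ok(e.into_mut()),
        // Not yet open: open it, convert the io::Error into a String,
        // and store the handle only if opening succeeded.
        Entry::Vacant(e) => {
            let file = File::open(path)
                .map_err(|ioe| format!("could not open file {} ({})", path, ioe))?;
            Ok(e.insert(file))
        }
    }
}
```

Because the early return from ? happens before insert, a file that fails to open leaves no entry behind in the map.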

std::fs::canonicalize for files that don't exist

I'm writing a program in Rust that creates a file at a user-defined path. I need to be able to normalize intermediate components (~/ should become $HOME/, ../ should go up a directory, etc.) in order to create the file in the right place. std::fs::canonicalize does almost exactly what I want, but it panics if the path does not already exist.
Is there a function that normalizes components the same way as std::fs::canonicalize but doesn't panic if the file doesn't already exist?
There are good reasons such a function isn't standard:
There's no unique path when you're dealing with both links and non-existing files. If a/b is a link to c/d/e, then a/b/../f could mean either a/f or c/d/f.
The ~ shortcut is a shell feature. You may want to generalize it (I do), but that's a non-obvious choice, especially when you consider that ~ is a valid file name on most systems.
That being said, it's sometimes useful, in cases where those ambiguities aren't a problem because of the nature of your application.
Here's what I do in such a case:
use {
    directories::UserDirs,
    lazy_regex::*,
    log::warn, // the `warn!` macro used below comes from the `log` crate
    std::path::{Component, Path, PathBuf},
};
/// build a usable path from a user input which may be absolute
/// (if it starts with / or ~) or relative to the supplied base_dir.
/// (we might want to try detect windows drives in the future, too)
pub fn path_from<P: AsRef<Path>>(
    base_dir: P,
    input: &str,
) -> PathBuf {
    let tilde = regex!(r"^~(/|$)");
    if input.starts_with('/') {
        // if the input starts with a `/`, we use it as is
        input.into()
    } else if tilde.is_match(input) {
        // if the input starts with `~` as first token, we replace
        // this `~` with the user home directory
        PathBuf::from(
            &*tilde
                .replace(input, |c: &Captures| {
                    if let Some(user_dirs) = UserDirs::new() {
                        format!(
                            "{}{}",
                            user_dirs.home_dir().to_string_lossy(),
                            &c[1],
                        )
                    } else {
                        warn!("no user dirs found, no expansion of ~");
                        c[0].to_string()
                    }
                })
        )
    } else {
        // we put the input behind the source (the selected directory
        // or its parent) and we normalize so that the user can type
        // paths with `../`
        // (`as_ref()` is needed because `base_dir` is a generic `AsRef<Path>`)
        normalize_path(base_dir.as_ref().join(input))
    }
}
/// Improve the path to try remove and solve .. token.
///
/// This assumes that `a/b/../c` is `a/c` which might be different from
/// what the OS would have chosen when b is a link. This is OK
/// for broot verb arguments but can't be generally used elsewhere
///
/// This function ensures a given path ending with '/' still
/// ends with '/' after normalization.
pub fn normalize_path<P: AsRef<Path>>(path: P) -> PathBuf {
    let ends_with_slash = path.as_ref()
        .to_str()
        .map_or(false, |s| s.ends_with('/'));
    let mut normalized = PathBuf::new();
    for component in path.as_ref().components() {
        match &component {
            Component::ParentDir => {
                if !normalized.pop() {
                    normalized.push(component);
                }
            }
            _ => {
                normalized.push(component);
            }
        }
    }
    if ends_with_slash {
        normalized.push("");
    }
    normalized
}
(This uses the directories crate to get the home directory in a cross-platform way, but other crates exist, and you could also just read the $HOME environment variable on most platforms.)
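One detail about the question is worth checking: std::fs::canonicalize itself returns an io::Result and reports a missing path as an Err; a panic only happens if that result is unwrapped. A small sketch of inspecting the error instead (try_canonicalize is a hypothetical wrapper name, and the NotFound kind is what is typically reported for a missing path):

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical wrapper: canonicalize `p`, exposing only the error kind
/// so the caller can decide how to treat a missing path.
fn try_canonicalize(p: &str) -> Result<PathBuf, io::ErrorKind> {
    fs::canonicalize(p).map_err(|e| e.kind())
}
```

A caller could, for instance, fall back to a lexical normalization such as normalize_path above when the kind is io::ErrorKind::NotFound.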

Idiomatic rust way to properly parse Clap ArgMatches

I'm learning Rust and trying to make a find-like utility (yes, another one). I'm using clap and trying to support both the command line and a config file for the program's parameters (this has nothing to do with the clap yml file).
I'm trying to parse the commands, and if no commands were passed to the app, I will try to load them from a config file.
Now I don't know how to do this in an idiomatic way.
fn main() {
    let matches = App::new("findx")
        .version(crate_version!())
        .author(crate_authors!())
        .about("find + directory operations utility")
        .arg(
            Arg::with_name("paths")
            ...
        )
        .arg(
            Arg::with_name("patterns")
            ...
        )
        .arg(
            Arg::with_name("operation")
            ...
        )
        .get_matches();
    let paths;
    let patterns;
    let operation;
    if matches.is_present("patterns") && matches.is_present("operation") {
        patterns = matches.values_of("patterns").unwrap().collect();
        paths = matches.values_of("paths").unwrap_or(clap::Values<&str>{"./"}).collect(); // this doesn't work
        operation = match matches.value_of("operation").unwrap() { // I dont like this
            "Append" => Operation::Append,
            "Prepend" => Operation::Prepend,
            "Rename" => Operation::Rename,
            _ => {
                print!("Operation unsupported");
                process::exit(1);
            }
        };
    } else if Path::new("findx.yml").is_file() {
        //TODO: try load from config file
    } else {
        eprintln!("Command line parameters or findx.yml file must be provided");
        process::exit(1);
    }
    if let Err(e) = findx::run(Config {
        paths: paths,
        patterns: patterns,
        operation: operation,
    }) {
        eprintln!("Application error: {}", e);
        process::exit(1);
    }
}
Is there an idiomatic way to extract the values of Option and Result types into the enclosing scope? All the examples I have read use match or if let Some(x) to consume the x value inside the scope of the pattern match, but I need to assign the value to a variable.
Can someone help me with this, or point me in the right direction?
Best Regards
Personally I see nothing wrong with using the match statements and folding it or placing it in another function. But if you want to remove it there are many options.
You can use the .default_value_if() method, which is implemented for clap::Arg, to give an argument a default value that depends on the value of another argument.
From the clap documentation:
// sets value of arg "other" to "default" if value of "--opt" is "special"
let m = App::new("prog")
    .arg(Arg::with_name("opt")
        .takes_value(true)
        .long("opt"))
    .arg(Arg::with_name("other")
        .long("other")
        .default_value_if("opt", Some("special"), "default"))
    .get_matches_from(vec![
        "prog", "--opt", "special"
    ]);
assert_eq!(m.value_of("other"), Some("default"));
In addition you can add a validator to your operation OR convert your valid operation values into flags.
Here's an example converting your match arm values into individual flags (smaller example for clarity).
extern crate clap;
use clap::{Arg, App};
fn command_line_interface<'a>() -> clap::ArgMatches<'a> {
    // Sets the command line interface of the program.
    App::new("something")
        .version("0.1")
        .arg(Arg::with_name("rename")
            .help("renames something")
            .short("r")
            .long("rename"))
        .arg(Arg::with_name("prepend")
            .help("prepends something")
            .short("p")
            .long("prepend"))
        .arg(Arg::with_name("append")
            .help("appends something")
            .short("a")
            .long("append"))
        .get_matches()
}
#[derive(Debug)]
enum Operation {
    Rename,
    Append,
    Prepend,
}
fn main() {
    let matches = command_line_interface();
    let operation = if matches.is_present("rename") {
        Operation::Rename
    } else if matches.is_present("prepend") {
        Operation::Prepend
    } else {
        // DEFAULT
        Operation::Append
    };
    println!("Value of operation is {:?}", operation);
}
I hope this helps!
EDIT:
You can also use Subcommands with your specific operations. It all depends on what you want the interface to be like.
let app_m = App::new("git")
    .subcommand(SubCommand::with_name("clone"))
    .subcommand(SubCommand::with_name("push"))
    .subcommand(SubCommand::with_name("commit"))
    .get_matches();
match app_m.subcommand() {
    ("clone", Some(sub_m)) => {}, // clone was used
    ("push", Some(sub_m)) => {}, // push was used
    ("commit", Some(sub_m)) => {}, // commit was used
    _ => {}, // Either no subcommand or one not tested for...
}
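Independently of clap, the string-to-enum match that the question dislikes can be moved out of main by implementing std::str::FromStr for Operation, so that the call site becomes a single parse(). This is only a sketch of that standard-library pattern, not something taken from the clap documentation; the error type and message are assumptions.

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
enum Operation {
    Rename,
    Append,
    Prepend,
}

impl FromStr for Operation {
    // A plain String keeps the sketch short; a dedicated error type would also work.
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "Append" => Ok(Operation::Append),
            "Prepend" => Ok(Operation::Prepend),
            "Rename" => Ok(Operation::Rename),
            other => Err(format!("Operation unsupported: {}", other)),
        }
    }
}
```

With this in place, a value such as "Append" can be converted with "Append".parse::<Operation>(), and the resulting Result can be handled once, in whichever scope needs the variable.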
