Assign value from match statement - rust

I'm trying to make a Git command in Rust. I'm using the clap argument parser crate to do the command line handling. I want my command to take an optional argument for which directory to do work in. If the command does not receive the option, it assumes the user's home directory.
I know that I can use the std::env::home_dir function to get the user's home directory if it is set, but the part that confuses me is how to properly use the match operator to get the value of the path. Here is what I've been trying:
use std::env;
fn main() {
// Do some argument parsing stuff...
let some_dir = if matches.is_present("some_dir") {
matches.value_of("some_dir").unwrap()
} else {
match env::home_dir() {
Some(path) => path.to_str(),
None => panic!("Uh, oh!"),
}
};
// Do more things
}
I get an error message when I try to compile this saying that path.to_str() doesn't live long enough. I get that the value returned from to_str lives for the length of the match scope but how can you return a value from a match statement that has to call another function?

path.to_str() will return a &str reference to the inner string contained in path, which will only live as long as path, that is inside the match arm.
You can use to_owned to get an owned copy of that &str. You will have to adapt the value from clap accordingly to have the same types in both branches of your if:
let some_dir = if matches.is_present("some_dir") {
matches.value_of("some_dir").unwrap().to_owned()
} else {
match env::home_dir() {
Some(path) => path.to_str().unwrap().to_owned(),
None => panic!("Uh, oh!"),
}
};
Alternatively, you could use Cow to avoid the copy in the first branch:
use std::borrow::Cow;
let some_dir: Cow<str> = if matches.is_present("some_dir") {
matches.value_of("some_dir").unwrap().into()
} else {
match env::home_dir() {
Some(path) => path.to_str().unwrap().to_owned().into(),
None => panic!("Uh, oh!"),
}
};
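As a minimal sketch of the Cow idea in isolation: the function below returns a borrowed string when the caller supplied one and an owned string otherwise. The names choose_dir, arg, and default_dir are assumptions made for illustration; arg stands in for the clap value and default_dir for the home-directory lookup.

```rust
use std::borrow::Cow;

/// Returns the user-supplied directory if present, otherwise an owned default.
/// `arg` and `default_dir` are hypothetical stand-ins for the clap value and
/// the home-directory lookup.
fn choose_dir<'a>(arg: Option<&'a str>, default_dir: impl FnOnce() -> String) -> Cow<'a, str> {
    match arg {
        // Borrow the existing string: no allocation, no copy.
        Some(dir) => Cow::Borrowed(dir),
        // Own the freshly built string so it can leave this function.
        None => Cow::Owned(default_dir()),
    }
}
```

Either variant dereferences to &str, so code that only reads the directory does not need to care which branch was taken.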

What is happening is that the scope of the match statement takes ownership of the PathBuf object returned from env::home_dir(). You then attempt to return a reference to that object, but the object ceases to exist immediately.
The solution is to return PathBuf rather than a reference to it (or convert it to a String and return that instead, in any case, it has to be some type that owns the data). You may have to change what matches.value_of("some_dir").unwrap() returns so that both branches return the same type.
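A sketch of the "return something that owns the data" approach, assuming the two inputs are passed in as parameters. The names working_dir, arg, and home are hypothetical; arg stands in for the clap value and home for the result of the home-directory lookup.

```rust
use std::path::PathBuf;

/// Picks the working directory as an owned `PathBuf`.
/// `arg` stands in for the clap value and `home` for the result of the
/// home-directory lookup; both names are assumptions for this sketch.
fn working_dir(arg: Option<&str>, home: Option<PathBuf>) -> Option<PathBuf> {
    match arg {
        // Copy the borrowed argument into an owned path.
        Some(dir) => Some(PathBuf::from(dir)),
        // The PathBuf is moved out as-is; nothing borrows from it.
        None => home,
    }
}
```

Because both branches produce an owned PathBuf, no reference has to outlive the match, and the caller decides what to do when neither value is available.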

There is a rather simple trick: increase the scope of path (and thus its lifetime) so that you can take a reference into it.
use std::env;
fn main() {
// Do some argument parsing stuff...
let path; // <--
let some_dir = if matches.is_present("some_dir") {
matches.value_of("some_dir").unwrap()
} else {
match env::home_dir() {
Some(p) => { path = p; path.to_str().unwrap() },
None => panic!("Uh, oh!"),
}
};
// Do more things
}
It is efficient, as path is only ever used when necessary, and does not require changing the types in the program.
Note: I added an .unwrap() after .to_str() because .to_str() returns an Option. And do note that the reason it returns an Option<&str> is because not all paths are valid UTF-8 sequences. You might want to stick to Path/PathBuf.
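To illustrate the UTF-8 caveat, here is a small sketch contrasting the two standard conversions on Path. The helper names strict and printable are made up for this example.

```rust
use std::path::Path;

/// Returns the path as `&str` only when it is valid UTF-8.
fn strict(path: &Path) -> Option<&str> {
    path.to_str()
}

/// Returns a printable copy of the path; any invalid UTF-8 is replaced
/// with U+FFFD, so this never fails but may lose information.
fn printable(path: &Path) -> String {
    path.to_string_lossy().into_owned()
}
```

The lossy form suits log messages; for opening files, keeping the value as a Path or PathBuf avoids the conversion entirely.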

Related

Passing variables into command line arguments [duplicate]

I want to return a string depending on the given argument.
fn hello_world(name:Option<String>) -> String {
if Some(name) {
return String::formatted("Hello, World {}", name);
}
}
String::formatted is not an available associated function; I only wrote it to make clear what I want to do. I browsed the docs already but couldn't find any string builder functions or something like that.
Use the format! macro:
fn hello_world(name: Option<&str>) -> String {
match name {
Some(n) => format!("Hello, World {n}"),
None => format!("Who are you?"),
}
}
In Rust, string formatting goes through the macro system because the format string and its arguments are checked at compile time; format! is built on the compiler-provided format_args! macro.
There are other issues with your code:
You don't specify what to do for a None - you can't just "fail" to return a value.
The syntax for if is incorrect; you want if let to pattern match.
Stylistically, use an implicit return when the value is at the end of the block.
In many (but not all) cases, you want to accept a &str instead of a String.
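The points above can be sketched with if let instead of match. The hello_owned adapter is a hypothetical addition showing how a caller that holds an Option<String> could still use the &str version.

```rust
/// Greets `name` when it is present, otherwise asks who is there.
fn hello_world(name: Option<&str>) -> String {
    // `if let` pattern-matches the Option; both branches are implicit returns.
    if let Some(n) = name {
        format!("Hello, World {}", n)
    } else {
        String::from("Who are you?")
    }
}

/// Hypothetical adapter for callers that own an `Option<String>`.
fn hello_owned(name: Option<String>) -> String {
    // as_deref turns Option<String> into Option<&str> without cloning.
    hello_world(name.as_deref())
}
```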
See also:
Is there a way to pass named arguments to format macros without repeating the variable names?
Since Rust 1.58, format strings can also capture variables from the surrounding scope by name:
fn hello_world(name: Option<&str>) -> String {
match name {
Some(n) => format!("Hello, World {n}"),
None => format!("Who are you?"),
}
}

cannot return value referencing local data

I'm new to Rust. The get_x509 function below produces the compiler error "cannot return value referencing local data pem.contents". I think I understand why (the return value references pem.contents, which is only in scope for that function), but I've not been able to work out how to get it to work.
The x509 functions in the code below come from the x509_parser crate
use x509_parser::prelude::*;
fn main() {
let cert = "";
get_x509(cert);
}
fn get_x509(cert: &str) -> X509Certificate {
let res_pem = parse_x509_pem(cert.as_bytes());
let x509_cert = match res_pem {
Ok((_, pem)) => {
let res_cert = parse_x509_certificate(&pem.contents);
match res_cert {
Ok((_, certificate)) => certificate,
Err(_err) => {
panic!("Parse failed")
}
}
}
Err(_err) => {
panic!("Parse failed")
}
};
return x509_cert;
}
I've tried making the cert variable a static value. If I inline the above code in the main() function, it works (but I have to match on &res_pem instead of res_pem).
According to x509-parser-0.14.0/src/certificate.rs, both the parameter and the return value of parse_x509_certificate share a lifetime 'a, so the returned certificate borrows from the input bytes. One way to solve the problem is to split get_x509 into two functions: the first returns the owned Pem, and the second, which calls parse_x509_certificate, borrows the Pem from its caller instead of from a local variable.
The following code compiles (but will panic at runtime since cert is empty):
fn main() {
let cert = "";
let pem = get_x509_pem(cert);
get_x509(&pem); // Return value is unused.
}
use x509_parser::prelude::*;
fn get_x509_pem(cert: &str) -> Pem {
let (_, pem) = parse_x509_pem(cert.as_bytes()).expect("Parse failed");
pem
}
fn get_x509(pem: &Pem) -> X509Certificate {
let x509_cert = match parse_x509_certificate(&pem.contents) {
Ok((_, certificate)) => certificate,
Err(_err) => {
panic!("Parse failed")
}
};
x509_cert
}
The issue here, just as you've said, is that you have something that only lives in the context of the function, and you want to return a reference to it (or to some parts of it). But when the function execution is finished, the underlying data is removed, hence you would return a dangling reference - and Rust prevents this.
The way around this (well illustrated by the answer of Simon Smith) is to return the data you'd like to reference instead of just returning the reference. So in your case, you want to return the owned pem object (taken from res_pem) and then do any further data extraction from it in the caller.
Reading the documentation of the library, you seem to be in an unlucky situation where there is no way around moving res_pem out of the function into a static space, since parse_x509_pem returns owned data, and X509Certificate contains references. Hence, the lifetime of the returned certificate has to outlive the function, but the object you reference (res_pem) is owned by the function and is removed when the execution of the function is finished.
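The ownership pattern can be shown without the crate. In the sketch below, Pem and Certificate are hypothetical stand-ins, not the x509_parser types: one owns a byte buffer, the other borrows from it. The function names are likewise invented for illustration.

```rust
/// Hypothetical stand-in for an owned PEM value.
struct Pem {
    contents: Vec<u8>,
}

/// Hypothetical stand-in for a certificate that borrows from `Pem::contents`.
struct Certificate<'a> {
    body: &'a [u8],
}

/// Step 1: produce the owned data and hand it to the caller.
fn load_pem(text: &str) -> Pem {
    Pem { contents: text.as_bytes().to_vec() }
}

/// Step 2: borrow from data the caller owns; the result cannot outlive `pem`.
fn parse_certificate<'a>(pem: &'a Pem) -> Certificate<'a> {
    Certificate { body: &pem.contents }
}
```

The caller keeps the Pem alive in its own scope and passes a reference to the second function, so the borrowed Certificate never points at data owned by a function that has already returned.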

Re-using values without declaring variables

In Kotlin, I can re-use values so:
"127.0.0.1:135".let {
connect(it) ?: System.err.println("Failed to connect to $it")
}
Is anything similar possible in Rust? To avoid using a temporary variable like this:
let text_address = "127.0.0.1:135";
TcpListener::bind(text_address).expect(&format!("Failed to connect to {}", text_address));
According to this reference, T.let in Kotlin is a generic method-like function which runs a closure (T) -> R with the given value T passed as the first argument. From this perspective, it resembles a mapping operation from T to R. Under Kotlin's syntax though, it looks like a means of making a scoped variable with additional emphasis.
We could do the exact same thing in Rust, but it doesn't bring anything new to the table, nor makes the code cleaner (using _let because let is a keyword in Rust):
trait LetMap {
fn _let<F, R>(self, mut f: F) -> R
where
Self: Sized,
F: FnMut(Self) -> R,
{
f(self)
}
}
impl<T> LetMap for T {}
// then...
"something"._let(|it| {
println!("it = {}", it);
"good"
});
When dealing with a single value, it is actually more idiomatic to just declare a variable. If you need to constrain the variable (and/or the value's lifetime) to a particular scope, just place it in a block:
let conn = {
let text_address = "127.0.0.1:135";
TcpListener::bind(text_address)?
};
There is also one more situation worth mentioning: Kotlin has an idiom for nullable values where x?.let is used to conditionally perform something when the value isn't null.
val value = ...
value?.let {
... // execute this block if not null
}
In Rust, an Option already provides a similar feature, either through pattern matching or the many available methods with conditional execution: map, map_or_else, unwrap_or_else, and_then, and more.
let value: Option<_> = get_opt();
// 1: pattern matching
if let Some(non_null_value) = value {
// ...
}
// 2: functional methods
let new_opt_value: Option<_> = value.map(|non_null_value| {
"a new value"
}).and_then(some_function_returning_opt);
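A runnable sketch of the functional style, using only standard Option methods. The functions parse_port and describe_port are made up for this example.

```rust
/// Parses an optional port string; `None` or an unparsable value gives `None`.
fn parse_port(raw: Option<&str>) -> Option<u16> {
    // map runs only when a value is present; and_then chains a step
    // that may itself produce None.
    raw.map(str::trim).and_then(|s| s.parse::<u16>().ok())
}

/// Describes the port, computing the fallback message only when needed.
fn describe_port(raw: Option<&str>) -> String {
    parse_port(raw).map_or_else(
        || String::from("no usable port"),
        |port| format!("port {}", port),
    )
}
```

Each closure plays the role of the block passed to ?.let in Kotlin: it is skipped entirely when the value is absent.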
A plain block gives a similar effect:
{
let text_address = "127.0.0.1:135";
TcpListener::bind(text_address).expect(&format!("Failed to connect to {}", text_address));
}
// now text_address is out of scope

Is there a way to modify how I'm using variables in a loop so that the order I'm initializing them does not matter?

I'm fairly new to Rust, so not entirely sure how to properly title the question because I don't fully understand the error. I have the following simplified code which I'm using to parse command line arguments:
use std::env;
fn main() {
let mut script: &str = "";
// Get the commandline arguments.
let args: Vec<String> = env::args().collect();
// Loop through and handle the commandline arguments.
// Skip the first argument; it's the name of the program.
for arg in args.iter().skip(1) {
let split: Vec<&str> = arg.trim().split("=").collect();
if split.len() == 2 {
match split[0]{
"file" => { script = split[1]; }
_ => { println!("Invalid parameter: {}", arg); }
}
} else {
println!("Invalid parameter: {}", arg);
println!("Parameters should consist of a parameter name and value separated by '='");
}
}
}
Which gives me the following error:
error: `args` does not live long enough
--> src/main.rs:25:1
|
12 | for arg in args.iter().skip(1) {
| ---- borrow occurs here
...
25 | }
| ^ `args` dropped here while still borrowed
|
= note: values in a scope are dropped in the opposite order they are created
When I change where the script variable is initialized:
use std::env;
fn main() {
// Get the commandline arguments.
let args: Vec<String> = env::args().collect();
let mut script: &str = "";
// Loop through and handle the commandline arguments.
// Skip the first argument; it's the name of the program.
for arg in args.iter().skip(1) {
let split: Vec<&str> = arg.trim().split("=").collect();
if split.len() == 2 {
match split[0]{
"file" => { script = split[1]; }
_ => { println!("Invalid parameter: {}", arg); }
}
} else {
println!("Invalid parameter: {}", arg);
println!("Parameters should consist of a parameter name and value separated by '='");
}
}
}
The error goes away. Based on the error and how the order that the variables are being initialized changes things, I think I'm making a fundamental mistake in how I'm using (borrowing?) the variables in the loop, but I'm not entirely sure what I'm doing wrong and the proper way to fix it. Is there a way to modify how I'm using the variables in the loop so that the order I'm initializing them does not matter?
Short answer: script refers to a portion of one of the strings allocated by env::args(). If you define script before args, then args is dropped first (as the compiler's message notes, "values are dropped in opposite order") and script points to deallocated memory. Your fix, defining the script after the args object, is correct.
To answer the edited question: the order of variables does matter when one of them is a reference to the other, and you are not allowed to change them arbitrarily. For an explanation of why that is so, read on.
In Rust, every reference is associated with a lifetime, the scope during which the reference is valid. To take an example from the book, lifetimes are what prevents the following from compiling (and crashing):
let r;
{
let x = 5;
r = &x;
}
println!("r: {}", r); // doesn't compile - x doesn't live long enough
In many cases, lifetimes are inferred automatically. For example, the following are equivalent:
{
let x = "foo";
let y: &str = "foo";
let z: &'static str = "foo";
}
That is, the compiler infers the static lifetime from the use of a string constant, which is statically allocated and exists during the entire execution of the program. On the other hand, the following uses a narrower lifetime:
// correct
let s = "foo".to_owned(); // allocate "foo" dynamically
let sref = s.as_str(); // points to heap-allocated "foo"
...
Here, sref is only valid for as long as s is valid. After dropping or mutating s, sref would point to uninitialized memory, which Rust carefully prevents. Inserting extra braces sometimes helps visualize the scopes:
// correct - sref cannot outlive s
let s = "foo".to_owned();
{
let sref = s.as_str();
...
}
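On the other hand, if you write them backwards, the borrow checker that produced the error above rejects it (compilers with non-lexical lifetimes accept it as long as sref is not used after s is dropped):
// incorrect order, rejected by the older borrow checker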
let mut sref = "";
let s = "foo".to_string();
sref = s.as_str();
To see why, let's insert more explicit scopes:
// incorrect, doesn't compile
{
let mut sref = "";
{
let s = "foo".to_string();
sref = s.as_str();
}
println!("{}", sref); // <-- here sref outlives s, so this use is rejected
}
This is essentially the same as the example from the book, and it is obviously not allowed to compile! And now it should be a bit clearer what the compiler means by "values in a scope are dropped in the opposite order they are created". The fact that s is declared after sref means that it is effectively nested in an inner scope, which is why it will be dropped before the stuff in the outer scopes. sref referring to anything in s means that after the inner scope, sref is pointing to uninitialized memory.
To get back to your code, env::args() returns an Args object whose Iterator implementation yields dynamically allocated Strings. Although you start off by assigning a static &str to script, the lifetime of the script reference is determined as an intersection of the scopes of all assigned values. In this case these are the static scope from the first assignment and the scope of args from the second assignment, and their intersection is the args scope, which ends up being used as the reference lifetime. Moving the script declaration after args places the script reference in an inner scope compared to args, ensuring that it always refers to a live object.
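One way to make the relationship explicit is to put the borrowing in a function whose signature ties the result to the argument list. This is a sketch under the assumption that the arguments are already collected into a slice; find_script is a hypothetical helper, and unlike the question's loop it does not skip the program name or report invalid parameters.

```rust
/// Returns the value of the first `file=<value>` argument, borrowed from `args`.
/// The lifetime `'a` states that the result is valid only while `args` is.
fn find_script<'a>(args: &'a [String]) -> Option<&'a str> {
    args.iter()
        // Keep only arguments of the form key=value.
        .filter_map(|arg| arg.trim().split_once('='))
        .find(|(key, _)| *key == "file")
        .map(|(_, value)| value)
}
```

Since the returned &str is produced from a reference to args, args necessarily exists before the result does, so the declaration order problem cannot arise at the call site.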
Is there a way to modify how I'm using variables in a loop so that the order I'm initializing them does not matter?
Yes, you can avoid borrowing at all by cloning the value:
use std::env;
fn main() {
let mut script = None;
for arg in env::args().skip(1) {
let mut parts = arg.trim().splitn(2, "=").fuse();
match (parts.next(), parts.next()) {
(Some("file"), Some(name)) => script = Some(name.to_owned()),
(Some(other), Some(_)) => {
println!("Invalid parameter: {}", other);
}
(Some(other), None) => {
println!("Invalid parameter: {}", other);
println!("Parameters should consist of a parameter name and value separated by '='");
}
(None, _) => {}
}
}
let script = script.expect("file is a required parameter");
do_thing_with_script(&script);
}
fn do_thing_with_script(_script: &str) {}
This also avoids allocating multiple Vecs. There's also a theoretical/potential memory savings as we don't have to keep the entire argument string in memory, just the parameter. On the flip side, there's a bit more allocation.
Profiling is always the right path, but it has yet to be my experience that command line processing is a large resource usage of a program. To that end, I advocate doing whichever route makes your code easiest to understand, maintain, and which gives your end users the best experience.
Usually that means using a library.
If you have your heart set on borrowing, then user4815162342's answer explains why you have to have the thing you are borrowing from outlive the borrow.
