I'm new to Rust and have the following working code. But I don't think what I am doing is the best way. I'm looking for insights about this piece of simple code.
I have a simple struct that holds some data:
struct BlobRef {
hashname: String,
}
impl BlobRef {
fn hashname(self) -> String {
self.hashname
}
}
And a function call. Don't worry about the source: &[u8], it will have its time to shine.
fn write(bref: BlobRef, source: &[u8]) -> io::Result<String> {
let hashname = bref.hashname();
match fs::create_dir_all(&hashname) {
Ok(_) => Ok(hashname),
Err(e) => Err(e)
}
}
I need to assign another scoped variable hashname to stop the compiler complaining about "use of moved variable". Is this idiomatic?
I don't think this is really a question of whether it is idiomatic; what you should write depends on what you want to achieve.
Taking self by value and moving the String out of the struct, consuming the struct in the process (which is what your example does), is perfectly legitimate in certain contexts; it depends entirely on your use case.
On the other hand, if you want only to get the value of the string to read it somewhere (as your example suggests), it is better to return a string slice:
impl BlobRef {
fn hashname(&self) -> &str {
&self.hashname
}
}
Now your second piece of code could look like this:
fn write(bref: &BlobRef, source: &[u8]) -> io::Result<String> {
let hashname = bref.hashname();
match fs::create_dir_all(hashname) {
Ok(_) => Ok(hashname.into()),
Err(e) => Err(e)
}
}
This requires an additional allocation to get a String out of the &str.
However, if the only purpose of BlobRef is to transfer the string to this function, then your original approach is perfectly fine.
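To make the trade-off concrete, here is a minimal sketch that offers both accessors side by side. The name into_hashname is my own (hypothetical) choice, and I've assumed the `?` operator is acceptable in place of the explicit match; the behaviour should be the same as in the question.

```rust
use std::fs;
use std::io;

struct BlobRef {
    hashname: String,
}

impl BlobRef {
    // Borrowing getter: read the name without consuming the struct.
    fn hashname(&self) -> &str {
        &self.hashname
    }

    // Consuming accessor (hypothetical name): moves the String out, no allocation.
    fn into_hashname(self) -> String {
        self.hashname
    }
}

// `_source` is unused here, just as `source` is in the question.
fn write(bref: BlobRef, _source: &[u8]) -> io::Result<String> {
    // `?` returns early with the error, replacing the explicit match.
    fs::create_dir_all(bref.hashname())?;
    // The struct is no longer needed, so hand its String to the caller.
    Ok(bref.into_hashname())
}
```

With both accessors available, the caller decides whether it needs to keep the BlobRef around, and no extra allocation happens in either case.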
With the Rust project I am working on I would like to keep the code as clean as I can but was having issues with side effects - more specifically how to communicate whether they have been successful or not. Assume we have this enum and struct (the struct may contain more members not relevant to my question):
enum Value{
Mutable(i32),
Constant(i32),
}
struct SomeStruct {
// --snip--
value: Value
}
I would like to have a function to change value:
impl SomeStruct {
fn change_value(&mut self, new_value: i32) {
match self.value {
Value::Mutable(_) => self.value = Value::Mutable(new_value),
Value::Constant(_) => (), /* meeeep! do not do this :( */
}
}
}
I am now unsure how to cleanly handle the case where value is Value::Constant.
From what I've learned in C or C++ you would just return a bool and return true when value was successfully changed or false when it wasn't. This does not feel satisfying as the function signature alone would not make it clear what the bool was for. Also, it would make it optional for the caller of change_value to just ignore the return value and not handle the case where the side effect did not actually happen. I could, of course, adjust the function name to something like try_changing_value but that feels more like a band-aid than actually fixing the issue.
The only alternative I know would be to approach it more "functionally":
fn change_value(some_struct: SomeStruct, new_value: i32) -> Option<SomeStruct> {
match some_struct.value {
Value::Mutable(_) => {
let mut new_struct = /* copy other members */;
new_struct.value = Value::Mutable(new_value);
Some(new_struct)
},
Value::Constant(_) => None,
}
}
However, I imagine if SomeStruct is expensive to copy this is not very efficient. Is there another way to do this?
Finally, to give some more context as to how I got here: My actual code is about solving a sudoku and I modelled a sudoku's cell as either having a given (Value::Constant) value or a guessed (Value::Mutable) value. The "given" values are the ones you get at the start of the puzzle and the "guessed" values are the ones you fill in yourself. This means changing "given" values should not be allowed. If modelling it this way is the actual issue, I would love to hear your suggestions on how to do it differently!
The general pattern to indicate whether something was successful or not is to use Result:
struct CannotChangeValue;
impl SomeStruct {
fn change_value(&mut self, new_value: i32) -> Result<(), CannotChangeValue> {
match self.value {
Value::Mutable(_) => {
self.value = Value::Mutable(new_value);
Ok(())
}
Value::Constant(_) => Err(CannotChangeValue),
}
}
}
That way the caller can use the existing methods, syntax, and other patterns to decide how to deal with it: ignore it, log it, propagate it, do something else, and so on. The compiler will also warn if the caller does nothing with the result, so ignoring it has to be explicit.
If the API is designed to let callers determine exactly how to mutate the value, then you may want to return Option<&mut i32> instead to indicate: "I may or may not have a value that you can mutate, here it is." This also has a wealth of methods and tools available to handle it.
I think that Result fits your use-case better, but it just depends on the flexibility and level of abstraction that you're after.
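As a rough illustration of what this buys the caller, here is a small sketch. The fill_cells helper is hypothetical (it is not from the question), and I've added Debug and PartialEq derives to the error type only so it can be compared and printed.

```rust
#[derive(Debug, PartialEq)]
struct CannotChangeValue;

enum Value {
    Mutable(i32),
    Constant(i32),
}

struct SomeStruct {
    value: Value,
}

impl SomeStruct {
    fn change_value(&mut self, new_value: i32) -> Result<(), CannotChangeValue> {
        match self.value {
            Value::Mutable(_) => {
                self.value = Value::Mutable(new_value);
                Ok(())
            }
            Value::Constant(_) => Err(CannotChangeValue),
        }
    }
}

// Hypothetical caller: stop at the first cell that refuses the change
// and propagate the error upwards with `?`.
fn fill_cells(cells: &mut [SomeStruct], guess: i32) -> Result<(), CannotChangeValue> {
    for cell in cells.iter_mut() {
        cell.change_value(guess)?;
    }
    Ok(())
}
```

A caller that really does not care can still write `let _ = cell.change_value(3);`, but that choice is then visible in the code.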
For the sake of completeness with kmdreko's answer, this is the way you would implement a mut-ref-getter, which IMO is the simpler and more flexible approach:
enum Value {
Mutable(i32),
Constant(i32),
}
impl Value {
pub fn get_mut(&mut self) -> Option<&mut i32> {
match self {
Value::Mutable(v) => Some(v),
Value::Constant(_) => None,
}
}
}
Unlike the Result approach, this forces the caller to consider that setting the value may not be possible. (Granted, Result is a must_use type, so they'd get a warning if discarding it.)
You can write a proxy method on SomeStruct that forwards the invocation to this method:
impl SomeStruct {
pub fn get_mut_value(&mut self) -> Option<&mut i32> {
self.value.get_mut()
}
}
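For what it's worth, here is a sketch of how calling code might use such a getter. The bump function is a hypothetical caller, not part of the question.

```rust
enum Value {
    Mutable(i32),
    Constant(i32),
}

impl Value {
    fn get_mut(&mut self) -> Option<&mut i32> {
        match self {
            Value::Mutable(v) => Some(v),
            Value::Constant(_) => None,
        }
    }
}

// Hypothetical caller: increment a guessed cell and return the new value,
// or return None (leaving the cell untouched) if it is a given cell.
fn bump(cell: &mut Value) -> Option<i32> {
    let slot = cell.get_mut()?; // `?` also works on Option
    *slot += 1;
    Some(*slot)
}
```

The caller cannot reach the inner i32 without going through the Option, so the "not mutable" case has to be handled somewhere.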
I am working on a system which produces and consumes large numbers of "events", they are a name with some small payload of data, and an attached function which is used as a kind of fold-left over the data, something like a reducer.
I receive from upstream something like {t: 'fieldUpdated', p: {new: 'new field value'}}, and in my program I must associate the fieldUpdated "callback" function with the incoming event and apply it. There is a confirmation command I must echo back (which follows a programmatic naming convention), and each type is custom.
I tried using simple macros to do codegen for the structs, callbacks, and with the paste::paste! macro crate, and with the stringify macro I made quite good progress.
Regrettably, however, I did not find a good way to metaprogram these into a list or map using macros. Extending an enum through macros doesn't seem to be possible, and solutions such as the use of ctors seem extremely hacky.
My ideal case is something like this:
type evPayload = {
new: String
}
let evHandler = fn(evPayload: )-> Result<(), Error> { Ok(()) }
// ...
let data = r#"{"t": "fieldUpdated", "p": {"new": "new field value"}}"#;
let v: Value = serde_json::from_str(data)?;
Given only knowledge of data, how can I use macros (the boilerplate is actually 2-3 types, 3 functions, and some factory and helper functions) in a way that lets me do a name-to-function lookup?
It seems like Serde's adjacently or internally tagged enum representations would get me there, if I could modify an enum in a macro: https://serde.rs/enum-representations.html#internally-tagged
It almost feels like I need a macro which can either maintain an enum, or I can "cheat" and use module scoped ctors to do a quasi-static initialization of the names and types into a map.
My program would have on the order of 40-100 of these, with anything from 3-10 in a module. I don't think ctors are necessarily a problem here, but the fact that they're a little grey area handshake, and that ctors might preclude one day being able to cross-compile to wasm put me off a little.
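For reference, the kind of name-to-function lookup described here can be sketched with plain macro_rules, under some loud assumptions: payloads are passed as raw strings rather than typed structs, and the names handlers!, Handler, lookup, field_updated and field_removed are all hypothetical. It also does not solve registration across modules, since every handler has to be listed in one invocation.

```rust
// Assumed handler shape: takes the raw payload, reports success or failure.
type Handler = fn(&str) -> Result<(), String>;

// Generates a `lookup` function that maps an event name to its handler.
macro_rules! handlers {
    ($($name:literal => $func:path),* $(,)?) => {
        fn lookup(name: &str) -> Option<Handler> {
            match name {
                $($name => Some($func as Handler),)*
                _ => None,
            }
        }
    };
}

// Hypothetical handlers; real ones would deserialize the payload first.
fn field_updated(payload: &str) -> Result<(), String> {
    if payload.is_empty() {
        Err(String::from("fieldUpdated needs a payload"))
    } else {
        Ok(())
    }
}

fn field_removed(_payload: &str) -> Result<(), String> {
    Ok(())
}

handlers! {
    "fieldUpdated" => field_updated,
    "fieldRemoved" => field_removed,
}
```

Because the lookup is an ordinary match, there is no static initialization involved, which sidesteps the ctor concern at the cost of one central list.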
I actually had need of something similar today; the enum macro part specifically. But beware of my method: here be dragons!
Someone more experienced than me — and less mad — should probably vet this. Please do not assume my SAFETY comments to be correct.
Also, if you don't have variants that collide with Rust keywords, you might want to tear out the '_' prefix hack entirely. I used a static mut byte array for that purpose, as manipulating strings was an order of magnitude slower, but that was benchmarked in a simplified function. There are likely better ways of doing this.
Finally, I am using it where failing to parse must cause a panic, so error handling is somewhat limited.
With that being said, here's my current solution:
use std::fmt::Display;
use std::str::{from_utf8_unchecked, FromStr};
/// NOTE: It is **imperative** that the length of this array is longer than the longest variant name + 1
static mut CHECK_BUFF: [u8; 32] = [b'_'; 32];
macro_rules! str_enums {
($enum:ident: $($variant:ident),* $(,)?) => {
#[allow(non_camel_case_types)]
#[derive(Debug, Default, Hash, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum $enum {
#[default]
UNINIT,
$($variant),*,
UNKNOWN
}
impl FromStr for $enum {
type Err = String;
fn from_str(s: &str) -> Result<Self, Self::Err> {
unsafe {
let len = s.len() + 1;
assert!(CHECK_BUFF.len() >= len);
// SAFETY: Currently only single threaded
CHECK_BUFF[1..len].copy_from_slice(s.as_bytes());
// SAFETY: Safe as long as CHECK_BUFF.len() >= s.len() + 1
match from_utf8_unchecked(&CHECK_BUFF[..len]) {
$(stringify!($variant) => Ok(Self::$variant),)*
_ => Err(format!(
"{} variant not accounted for: {s} ({},)",
stringify!($enum),
from_utf8_unchecked(&CHECK_BUFF[..len])
))
}
}
}
}
impl From<&$enum> for &'static str {
fn from(variant: &$enum) -> Self {
unsafe {
match variant {
// SAFETY: The first byte is always '_', and stripping it off should be safe.
$($enum::$variant => from_utf8_unchecked(&stringify!($variant).as_bytes()[1..]),)*
$enum::UNINIT => {
eprintln!("uninitialized {}!", stringify!($enum));
""
}
$enum::UNKNOWN => {
eprintln!("unknown {}!", stringify!($enum));
""
}
}
}
}
}
impl Display for $enum {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "{}", Into::<&str>::into(self))
}
}
};
}
And then I call it like so:
str_enums!(
AttributeKind:
_alias,
_allowduplicate,
_altlen,
_api,
...
_enum,
_type,
_struct,
);
str_enums!(
MarkupKind:
_alias,
_apientry,
_command,
_commands,
...
);
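If the unsafe buffer is a concern, one alternative (a different technique from the macro above, not a drop-in replacement) is to pair each variant with an explicit string literal. That avoids both the '_' prefix and the static mut, at the cost of typing each name twice. This is only a sketch; str_enum!, as_str and the CamelCase variant names are my own assumptions.

```rust
use std::str::FromStr;

// Each variant is listed together with the exact string it parses from.
macro_rules! str_enum {
    ($name:ident { $($variant:ident => $text:literal),* $(,)? }) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        enum $name {
            $($variant),*
        }

        impl FromStr for $name {
            type Err = String;
            fn from_str(s: &str) -> Result<Self, Self::Err> {
                match s {
                    $($text => Ok(Self::$variant),)*
                    other => Err(format!(
                        "{} variant not accounted for: {}",
                        stringify!($name),
                        other
                    )),
                }
            }
        }

        impl $name {
            // Inverse mapping: the literal given next to the variant.
            fn as_str(&self) -> &'static str {
                match self {
                    $(Self::$variant => $text,)*
                }
            }
        }
    };
}

// Keywords such as `enum` and `type` are fine as strings.
str_enum!(AttributeKind {
    Alias => "alias",
    Enum => "enum",
    Type => "type",
    Struct => "struct",
});
```

I have not benchmarked this against the buffer approach, so the performance note above may still favour the original.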
I would like to have the details of an error be propagated upwards. I used error-chain previously, but that has not been maintained or kept compatible with the rest of the ecosystem as far as I can tell.
For example, in this example:
use std::str::FromStr;
use anyhow::Result;
fn fail() -> Result<u64> {
Ok(u64::from_str("Some String")?)
}
fn main() {
if let Err(e) = fail(){
println!("{:?}", e);
}
}
The error I am getting is:
invalid digit found in string
I would need the error message to have the key details, including at the point of failure, for example:
- main: invalid digit found in string
- fail: "Some String" is not a valid digit
What's the best way of doing this?
anyhow provides the context() and with_context() methods for that:
use anyhow::{Context, Result};
use std::str::FromStr;
fn fail() -> Result<u64> {
let s = "Some String";
Ok(u64::from_str(s).with_context(|| format!("\"{s}\" is not a valid digit"))?)
}
fn main() {
if let Err(e) = fail() {
println!("{:?}", e);
}
}
"Some String" is not a valid digit
Caused by:
invalid digit found in string
If you want custom formatting, you can use the Error::chain() method:
if let Err(e) = fail() {
for err in e.chain() {
println!("{err}");
}
}
"Some String" is not a valid digit
invalid digit found in string
And if you want additional details (e.g. where the error happened), you can use a custom error type and downcast it (for error source you can also capture a backtrace).
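To illustrate the custom error type idea without depending on any crate, here is a std-only sketch. It is an analogue of the context/chain behaviour, not anyhow's implementation; InvalidDigit and chain_messages are hypothetical names. The error carries the offending input and exposes the underlying error through source().

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Custom error that remembers the input that failed to parse.
#[derive(Debug)]
struct InvalidDigit {
    input: String,
    source: ParseIntError,
}

impl fmt::Display for InvalidDigit {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "\"{}\" is not a valid digit", self.input)
    }
}

impl Error for InvalidDigit {
    // Expose the underlying error so callers can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

fn fail(s: &str) -> Result<u64, InvalidDigit> {
    s.parse::<u64>().map_err(|source| InvalidDigit {
        input: s.to_owned(),
        source,
    })
}

// Collect the message of the error and of every source below it.
fn chain_messages(err: &(dyn Error + 'static)) -> Vec<String> {
    let mut out = Vec::new();
    let mut current = Some(err);
    while let Some(e) = current {
        out.push(e.to_string());
        current = e.source();
    }
    out
}
```

Since InvalidDigit implements std::error::Error, it should also convert into anyhow::Error through `?` if you mix the two approaches.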
This is a tricky thing to accomplish and I'm not sure that there is a simple and non-invasive way to capture all of the details of any possible error without knowledge of the particular function being invoked. For example, we may want to display some arguments to the function call that failed, but evaluating other arguments might be problematic -- they may not even be able to be turned into strings.
Maybe the argument is another function call, too, so should we capture its arguments or only its return value?
I whipped up this example quickly to show that we can at least fairly trivially capture the exact source expression. It provides a detail_error! macro that takes an expression that produces Result<T, E> and emits an expression that produces Result<T, DetailError<E>>. The DetailError wraps the original error value and additionally contains a reference to a string of the original source code fed to the macro.
use std::error::Error;
use std::str::FromStr;
#[derive(Debug)]
struct DetailError<T: Error> {
expr: &'static str,
cause: T,
}
impl<T: Error> DetailError<T> {
pub fn new(expr: &'static str, cause: T) -> DetailError<T> {
DetailError { expr, cause }
}
// Some getters we don't use in this example, but should be present to have
// a complete API.
#[allow(dead_code)]
pub fn cause(&self) -> &T {
&self.cause
}
#[allow(dead_code)]
pub fn expr(&self) -> &'static str {
self.expr
}
}
impl<T: Error> Error for DetailError<T> { }
impl<T: Error> std::fmt::Display for DetailError<T> {
fn fmt(&self, f: &mut std::fmt::Formatter) -> Result<(), std::fmt::Error> {
write!(f, "While evaluating ({}): ", self.expr)?;
std::fmt::Display::fmt(&self.cause, f)
}
}
macro_rules! detail_error {
($e:expr) => {
($e).map_err(|err| DetailError::new(stringify!($e), err))
}
}
fn main() {
match detail_error!(u64::from_str("Some String")) {
Ok(_) => {},
Err(e) => { println!("{}", e); }
};
}
This produces the runtime output:
While evaluating (u64::from_str("Some String")): invalid digit found in string
Note that this only shows the string because it's a literal in the source. If you pass a variable/parameter instead, you will see that identifier in the error message instead of the string.
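A trimmed-down sketch of the same macro shows that limitation. Here the input arrives through a parameter (the parse function and its input parameter are hypothetical), so the message names the identifier rather than the value.

```rust
use std::fmt;
use std::num::ParseIntError;
use std::str::FromStr;

#[derive(Debug)]
struct DetailError<E> {
    expr: &'static str,
    cause: E,
}

impl<E: fmt::Display> fmt::Display for DetailError<E> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "While evaluating ({}): {}", self.expr, self.cause)
    }
}

macro_rules! detail_error {
    ($e:expr) => {
        ($e).map_err(|err| DetailError {
            // stringify! sees source tokens, not runtime values.
            expr: stringify!($e),
            cause: err,
        })
    };
}

// The captured expression mentions `input`, not the string it holds.
fn parse(input: &str) -> Result<u64, DetailError<ParseIntError>> {
    detail_error!(u64::from_str(input))
}
```

If the runtime value matters, it has to be stored in the error explicitly, for example as an extra field filled in by the caller.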
When you run your app with the environment variable RUST_BACKTRACE set to 1 or full, you'll get more error details, without the need to recompile your program. That, however, doesn't mean you're going to get an extra message like "Some String" is not a valid digit, as the parsing function simply doesn't generate such.
Result.expect()'s console output wasn't what I needed, so I extended Result with my own version:
trait ResultExt<T> {
fn or_exit(self, message: &str) -> T;
}
impl<T> ResultExt<T> for ::std::result::Result<T, Error> {
fn or_exit(self, message: &str) -> T {
if self.is_err() {
io::stderr().write(format!("FATAL: {} ({})\n", message, self.err().unwrap()).as_bytes()).unwrap();
process::exit(1);
}
return self.unwrap();
}
}
As I understand, Rust doesn't support varargs yet, so I have to use it like that, correct?
something().or_exit(&format!("Ah-ha! An error! {}", "blah"));
That's too verbose compared to either Java, Kotlin or C. What is the preferred way to solve this?
I don't think the API you suggested is particularly unergonomic. If maximum performance matters, it might make sense to put the error generation in a closure or provide an API for that too, so the String is only allocated when there is actually an error, which might be especially relevant when something is particularly expensive to format. (Like all the _else methods for std::result::Result.)
However, you might be able to make it more ergonomic by defining a macro which takes a result, a &str and format parameters. For example, it could look like this (based on @E_net4's comment):
use std::process;
macro_rules! or_exit {
($res:expr, $fmt:expr, $($arg:tt)+) => {
$res.unwrap_or_else(|e| {
let message = format!($fmt, $($arg)+);
// eprintln! already appends a newline.
eprintln!("FATAL: {} ({})", message, e);
process::exit(1)
})
};
}
fn main() {
let x: Result<i32, &'static str> = Err("dumb user, please replace");
let _ = or_exit!(x, "Ah-ha! An error! {}", "blahh");
}
Note that this might not yield the best error messages if users supply invalid arguments. I did not want to change your code too much, but if you decide to make the macro pure sugar and nothing else, you should probably extend your API to take a closure instead of a string. You might also want to reconsider the name of the macro.
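A rough sketch of that closure-based variant, with the macro reduced to sugar on top of it, might look like the following. The method name or_exit_with is my own (hypothetical), and I've assumed any error type implementing Display is acceptable.

```rust
use std::fmt::Display;
use std::process;

trait ResultExt<T> {
    // The closure is only called on the error path.
    fn or_exit_with<F: FnOnce() -> String>(self, message: F) -> T;
}

impl<T, E: Display> ResultExt<T> for Result<T, E> {
    fn or_exit_with<F: FnOnce() -> String>(self, message: F) -> T {
        self.unwrap_or_else(|e| {
            eprintln!("FATAL: {} ({})", message(), e);
            process::exit(1)
        })
    }
}

// Pure sugar: forwards its format arguments into the closure.
macro_rules! or_exit {
    ($res:expr, $($fmt:tt)+) => {
        $res.or_exit_with(|| format!($($fmt)+))
    };
}
```

On the success path nothing is formatted or allocated, and since the macro forwards its arguments straight to format!, it also accepts a message without any parameters.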
I'm starting to get comfortable with Rust, but there are still some things that are really tripping me up with lifetimes. In this particular case, what I want to do is have an enum which may have different types wrapped as a generic parameter class to create strongly typed query parameters in a URL, though the specific use case is irrelevant, and return a conversion of that wrapped value into an &str. Here's an example of what I want to do:
enum Param<'a> {
MyBool(bool),
MyLong(i64),
MyStr(&'a str),
}
impl<'a> Param<'a> {
fn into(self) -> (&'static str, &'a str) {
match self {
Param::MyBool(b) => ("my_bool", &b.to_string()), // clearly wrong
Param::MyLong(i) => ("my_long", &i.to_string()), // clearly wrong
Param::MyStr(s) => ("my_str", s),
}
}
}
What I ended up doing is this to deal with the obvious lifetime issue (and yes, it's obvious to me why the lifetime isn't long enough for the into() function):
enum Param<'a> {
MyBool(&'a str), // no more static typing :(
MyLong(&'a str), // no more static typing :(
MyStr(&'a str),
}
impl<'a> Param<'a> {
fn into(self) -> (&'static str, &'a str) {
match self {
Param::MyBool(b) => ("my_bool", b),
Param::MyLong(i) => ("my_long", i),
Param::MyStr(s) => ("my_str", s),
}
}
}
This seems like an ugly workaround in a case where what I really want to do is guarantee the static typing of certain params, b/c now it's the constructor of the enum that's responsible for the proper type conversion. Curious if there is a way to do this... and yes, at some point I need &str as that is a parameter elsewhere, specifically:
let body = url::form_urlencoded::serialize(
vec![Param::MyBool(&true.to_string()).
into()].
into_iter());
I went through a whole bunch of things like trying to return String instead of &str from into(), but that only caused conversion issues down the line with a map() of String -> &str. Having the tuple correct from the start is the easiest thing, rather than fighting the compiler at every turn after that.
-- update--
Ok, so I went back to a (String,String) tuple in the into() function for the enum. It turns out that there is an "owned" version of the url::form_urlencoded::serialize() function which this is compatible with.
pub fn serialize_owned(pairs: &[(String, String)]) -> String
But, now I'm also trying to use the same pattern for the query string in the hyper::URL, specifically:
fn set_query_from_pairs<'a, I>(&mut self, pairs: I)
where I: Iterator<Item=(&'a str, &'a str)>
and then I try to use map() on the iterator that I have from the (String,String) tuple:
params: Iterator<Item=(String, String)>
url.set_query_from_pairs(params.map(|x: (String, String)| ->
(&str, &str) { let (ref k, ref v) = x; (k, v) } ));
But this gets error: x.0 does not live long enough. Ref seems correct in this case, right? If I don't use ref, then it's k/v that don't live long enough. Is there something 'simple' that I'm missing in this?
It is not really clear why you can't do this:
enum Param<'a> {
MyBool(bool),
MyLong(i64),
MyStr(&'a str),
}
impl<'a> Param<'a> {
fn into(self) -> (&'static str, String) {
match self {
Param::MyBool(b) => ("my_bool", b.to_string()),
Param::MyLong(i) => ("my_long", i.to_string()),
Param::MyStr(s) => ("my_str", s.into()),
}
}
}
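(At the time of writing, into() was slightly more efficient than to_string() for the &str -> String conversion; on current compilers to_string() is specialized for str, so the two perform the same.)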
You can always get a &str from String, e.g. with deref coercion or explicit slicing.