Is mem::forget(mem::uninitialized()) defined behavior? - rust

In mutagen, I'm injecting various
mutations in the code. One thing I'd like to mutate is the pattern
if let Ok(x) = y { .. }. However, this poses quite the challenge, as I
cannot know the type of y – the user could have built their own enum with an
unary Ok variant. I can still opportunistically mutate it for cases where we
actually have a Result whose error type implements Default using a trait
that looks like the following simplified:
#![feature(specialization)]
pub trait Errorer {
fn err(self, mutate: bool) -> Self;
}
impl<X> Errorer for X {
default fn err(self, _mutate: bool) -> Self {
self
}
}
impl<T, E> Errorer for Result<T, E>
where
E: Default,
{
fn err(self, mutate: bool) -> Self {
if mutate {
Err(Default::default())
} else {
self
}
}
}
Alas, there aren't that many errors which implement Default, so this is
not too useful. Even an implementation for Result<T, Box<Error>> would give
us more bang for the buck (and be completely possible). However, given that I
don't care much about code actually inspecting the error, I wonder if I could
do a general implementation by extending the mutation of the above code to
match Errorer::err(y, mutation) {
Ok(x) => { .. }
Err(x) => { mem::forget(x); }
}
and have err return Err(mem::uninitialized()) when mutating – so is this
behavior safe? Note: I'm returning Err(mem::uninitialized()) from a method,
only to mem::forget it later. I see no way this could panic, so we should
assume that the value will be indeed forgotten.
Is this defined behavior or should I expect nasal demons?

No, this is not defined behavior, at least not for all types. (I can't tell how your code would be called as part of mutation, so I don't know if you have control over the types here, but the generic impl sure makes it look like you do not.) That's demonstrated by the following piece of code:
#![feature(never_type)]
use std::mem;
fn main() {
unsafe { mem::forget(mem::uninitialized::<!>()) }
}
If you run this on the playground, you will see the program die with a SIGILL. The ASM output shows that LLVM just optimized the entire program to immediate SIGILL because of the way it uses a value of the uninhabited type !:
playground::main:
ud2
Generally speaking, it is near impossible to correctly use mem::uninitialized in generic code, see e.g. this issue of rc::Weak. For this reason, that function is in the process of being deprecated and replaced. But that won't help you here; what you want to do is just outright illegal for Result<T, !>.

Related

How can I communicate whether side effects have been successful or not?

With the Rust project I am working on I would like to keep the code as clean as I can but was having issues with side effects - more specifically how to communicate whether they have been successful or not. Assume we have this enum and struct (the struct may contain more members not relevant to my question):
enum Value{
Mutable(i32),
Constant(i32),
}
struct SomeStruct {
// --snip--
value: Value
}
I would like to have a function to change value:
impl SomeStruct {
fn change_value(&mut self, new_value: i32) {
match self.value {
Value::Mutable(_) => self.value = Value::Mutable(new_value),
Value::Constant(_) => (), /* meeeep! do not do this :( */
}
}
}
I am now unsure how to cleanly handle the case where value is Value::Constant.
From what I've learned in C or C++ you would just return a bool and return true when value was successfully changed or false when it wasn't. This does not feel satisfying as the function signature alone would not make it clear what the bool was for. Also, it would make it optional for the caller of change_value to just ignore the return value and not handle the case where the side effect did not actually happen. I could, of course, adjust the function name to something like try_changing_value but that feels more like a band-aid than actually fixing the issue.
The only alternative I know would be to approach it more "functionally":
fn change_value(some_struct: SomeStruct, new_value: i32) -> Option<SomeStruct> {
match self.value {
Value::Mutable(_) => {
let mut new_struct = /* copy other members */;
new_struct.value = Value::Mutable(new_value);
Some(new_struct)
},
Value::Constant(_) => None,
}
}
However, I imagine if SomeStruct is expensive to copy this is not very efficient. Is there another way to do this?
Finally, to give some more context as to how I got here: My actual code is about solving a sudoku and I modelled a sudoku's cell as either having a given (Value::Constant) value or a guessed (Value::Mutable) value. The "given" values are the once you get at the start of the puzzle and the "guessed" values are the once you fill in yourself. This means changing "given" values should not be allowed. If modelling it this way is the actual issue, I would love to hear your suggestions on how to do it differently!
The general pattern to indicate whether something was successful or not is to use Result:
struct CannotChangeValue;
impl SomeStruct {
fn change_value(&mut self, new_value: i32) -> Result<(), CannotChangeValue> {
match self.value {
Value::Mutable(_) => {
self.value = Value::Mutable(new_value);
Ok(())
}
Value::Constant(_) => Err(CannotChangeValue),
}
}
}
That way the caller can use the existing methods, syntax, and other patterns to decide how to deal with it. Like ignore it, log it, propagate it, do something else, etc. And the compiler will warn that the caller will need to do something with the result (even if that something is to explicitly ignore it).
If the API is designed to let callers determine exactly how to mutate the value, then you may want to return Option<&mut i32> instead to indicate: "I may or may not have a value that you can mutate, here it is." This also has a wealth of methods and tools available to handle it.
I think that Result fits your use-case better, but it just depends on the flexibility and level of abstraction that you're after.
For the sake of completeness with kmdreko's answer, this is the way you would implement a mut-ref-getter, which IMO is the simpler and more flexible approach:
enum Value {
Mutable(i32),
Constant(i32),
}
impl Value {
pub fn get_mut(&mut self) -> Option<&mut i32> {
match self {
Value::Mutable(ref mut v) => Some(v),
Value::Constant(_) => None,
}
}
}
Unlike the Result approach, this forces the caller to consider that setting the value may not be possible. (Granted, Result is a must_use type, so they'd get a warning if discarding it.)
You can write a proxy method on SomeStruct that forwards the invocation to this method:
impl SomeStruct {
pub fn get_mut_value(&mut self) -> Option<&mut i32> {
self.value.get_mut()
}
}

Type handling in the trait object

I'm new to rust and I recently ran into a problem with trait
I have a trait that is used as the source of a message and is stored in a structure as a Box trait object. I simplified my logic and the code looks something like this.
#[derive(Debug)]
enum Message {
MessageTypeA(i32),
MessageTypeB(f32),
}
enum Config {
ConfigTypeA,
ConfigTypeB,
}
trait Source {
fn next(&mut self) -> Message;
}
struct SourceA;
impl Source for SourceA {
fn next(&mut self) -> Message {
Message::MessageTypeA(1)
}
}
struct SourceB;
impl Source for SourceB {
fn next(&mut self) -> Message {
Message::MessageTypeB(1.1)
}
}
struct Test {
source: Box<dyn Source>,
}
impl Test {
fn new(config: Config) -> Self {
Test {
source: match config {
Config::ConfigTypeA => Box::new(SourceA{}),
Config::ConfigTypeB => Box::new(SourceB{}),
}
}
}
fn do_sth(&mut self) -> String {
match self.source.next() {
Message::MessageTypeA(a) => format!("a is {:?}", a),
Message::MessageTypeB(b) => format!("b is {:?}", b),
}
}
fn do_sth_else(&mut self, message: Message) -> String {
match message {
Message::MessageTypeA(a) => format!("a is {:?}", a),
Message::MessageTypeB(b) => format!("b is {:?}", b),
}
}
}
Different types of Source return different types of Message, the Test structure needs to create the corresponding trait object according to config and call next() in the do_sth function.
So you can see two enum types Config and Message, which I feel is a strange usage, but I don't know what's strange about it.
I tried to use trait association type, but then I need to specify the association type when I declare the Test structure like source: Box<dyn Source<Item=xxxx>> but I don't actually know the exact type when creating the struct object.
Then I tried to use Generic type, but because of the need of the upper code, Test cannot use Generic.
So please help me, is there a more elegant or rustic solution to this situation?
So what might be going on here is that you're mis-using the idea of trait objects. The whole idea of a trait object is that you want to write code that relies purely on its interface. As soon as you find yourself checking for the underlying type of a trait object, that should raise a red flag.
Of course in your example you're not using any sort of weird run-time type checking; instead, you're checking the type implicitly via the enums.
Still, you recognize this as problematic. First, it becomes very clumsy when you try to add another variant to your sources, because now you have to go and add that to your message and config enums as well. Second, you then have to add that handling logic to everywhere the trait object is used. And finally, the type system "lies" a bit. It seems to me that source A will only ever send messages of the first variant and source B will only ever send messages of the second variant, and that's how we're telling them apart.
So what's a way out here?
First, trait objects should be designed such that they can be used without having to know which implementation we're dealing with. Traits represent roles that structs can play, and code that uses trait objects says "I'm happy to work with anyone who can play that role".
If your code isn't happy to work with anyone who can play that trait's role, there's a flaw in the design.
It would be good to know how you are, in general, processing the messages returned by the sources.
For example, does it matter for the rest of your program that source A only ever returns integers and source B only ever returns floats? And if so, what about it is it that matters, and could that be abstracted behind a trait?

How to idiomatically convert from Option<T> to Result<T, ()>?

I'm a Rust newbie! What's the best way to convert from an Option<T> to a Result<T, ()>?
The TryFrom trait seems prevalent and returns a Result. The popular num_traits' NumCast has many conversions but they all return an Option<T>. Similarly, as do the NonZero* constructors such as NonZeroI32 in the Rust Standard Library. I then noticed that NumCast implements a from() that returns an Option<T> so I was thinking that maybe it had a non-standard way of doing things in general but then I saw the NonZero* implementations and questioned that idea.
Regardless, converting from Options to Results seem frequent and I haven't found a neat way to do yet. E.g.:
/// Add common conversion from an i32 to a non-zero i32.
impl TryFrom<Container<i32>> for Container<NonZeroI32> {
type Error = ();
fn try_from(container: Container<i32>) -> Result<Self, ()> {
// NonZeroI32::new() returns an Option not a Result. Try a helper.
Ok(Self(option_to_result(NonZeroI32::new(container.0))?))
}
}
/// Helper function to convert from an Option to a Result (both types are
/// foreign and so is From).
fn option_to_result<T>(option: Option<T>) -> Result<T, ()> {
if let Some(some) = option {
Ok(some)
} else {
Err(())
}
}
/// Add another common conversion from an i32 to an i16.
impl TryFrom<Container<i32>> for Container<i16> {
type Error = ();
fn try_from(container: Container<i32>) -> Result<Self, ()> {
// NumCast::from() also returns an Option not a Result. Try map_or() instead
// of the helper.
Ok(Self(NumCast::from(container.0).map_or(Err(()), |x| Ok(x))?))
}
}
(Above examples in the Rust Playground.)
These NumCast, NonZero*, and TryFrom conversions seem common enough but my approach feels clumsy as though I'm pitting the Option and Result types against each other. I struggle with these conversions and also miss the fundamental point of the Option type given Result<T,()> feels similar.
So, what's the idiomatic way to convert an Option<T> to Result<T,()> in Rust 2018?
Option has the ok_or method, usable for exactly this (well, for more wide case, in fact, but your request fits here too):
fn option_to_result<T>(option: Option<T>) -> Result<T, ()> {
option.ok_or(())
}
Modified playground

Best practice to return a Result<_, impl Error> and not a Result<_, &str> in Rust?

Is it ok practice to have this style Result?
fn a() -> Result<u32, &'static str>
And then what is the purpose of the Error trait? https://doc.rust-lang.org/std/error/trait.Error.html
Is an impl Error Result better practice?
impl Error for MyError {..... }
fn a() -> Result<u32, MyError>
In short: No, it's not okay. String as error throws away information about details and cause, making errors useless for callers as they won't be able to inspect and possibly recover from it.
In case you just need to fill Error parameter with something, create a unit struct. It's not much useful, but it's also not as volative as string. And you can easily distinguish foo::SomeError from bar::SomeError.
#[derive(Debug)]
pub struct SomeError; // No fields.
In case you can enumerate error variants, use enum.
It is also sometimes useful to "include" other errors into it.
#[derive(Debug)]
pub enum PasswordError {
Empty,
ToShort,
NoDigits,
NoLetters,
NoSpecials
}
#[derive(Debug)]
pub enum ConfigLoadError {
InvalidValues,
DeserializationError(serde::de::Error),
IoError(std::io::Error),
}
Nobody stops you from using structs.
They're particularly useful when you intentionaly want to hide some information from caller (In contrast to enums whose variants always have public visibility). E.g. caller has nothing to do with error message, but can use kind to handle it:
pub enum RegistrationErrorKind {
InvalidName { wrong_char_idx: usize },
NonUniqueName,
WeakPassword,
DatabaseError(db::Error),
}
#[derive(Debug)]
pub struct RegistrationError {
message: String, // Private field
pub kind: RegistrationErrorKind, // Public field
}
impl Error - existential type - makes no sense here. You can't return different error types with it in error place, if this was your intent. And opaque errors are not much useful, just like strings.
std::error::Error trait ensures that your SomeError type has implementation for std::fmt::{Display, Debug} (For displaying error to user and developer, correspondingly) and provides some useful methods like source (This returns the cause of this error); is, downcast, downcast_ref, downcast_mut. Last 4 are for error type erasure.
Error type erasure
Error type erasure has it's tradeoffs, but it's also worth mentioning.
It's also especially useful when writing somewht high-level application code. But in case of libraries you should think twice before deciding to use this approach, because it will make your library unusable with 'no_std'.
Say you have some function with non-trivial logic that can return values of some error types, not exactly one. In this case you can use (But don't abuse) error type erasure:
use std::error::Error;
use std::fmt;
use std::fs;
use std::io::Error as IoError;
use std::net::AddrParseError;
use std::net::Ipv4Addr
use std::path::Path;
// Error for case where file contains '127.0.0.1'
#[derive(Debug)]
pub struct AddressIsLocalhostError;
// Display implementation is required for std::error::Error.
impl fmt::Display for AddressIsLocalhostError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "Address is localhost")
}
}
impl Error for AddresIsLocalhostError {} // Defaults are okay here.
// Now we have a function that takes a path and returns
// non-localhost Ipv4Addr on success.
// On fail it can return either of IoError, AddrParseError or AddressIsLocalhostError.
fn non_localhost_ipv4_from_file(path: &Path) -> Result<Ipv4Addr, Box<dyn Error + 'static>> {
// Opening and reading file may cause IoError.
// ? operator will automatically convert it to Box<dyn Error + 'static>.
// (via From trait implementation)
// This way concrete type of error is "erased": we don't know what's
// in a box, in fact it's kind of black box now, but we still can call
// methods that Error trait provides.
let content = fs::read_to_string(path)?;
// Parsing Ipv4Addr from string [slice]
// may cause another error: AddressParseError.
// And ? will convert it to to the same type: Box<dyn Error + 'static>
let addr: Ipv4Addr = content.parse()?;
if addr == Ipv4Add::new(127, 0, 0, 1) {
// Here we perform manual conversion
// from AddressIsLocalhostError
// to Box<dyn Error + 'static> and return error.
return Err(AddressIsLocalhostError.into());
}
// Everyhing is okay, returning addr.
Ok(Ipv4Addr)
}
fn main() {
// Let's try to use our function.
let maybe_address = non_localhost_ipv4_from_file(
"sure_it_contains_localhost.conf"
);
// Let's see what kind of magic Error trait provides!
match maybe_address {
// Print address on success.
Ok(addr) => println!("File was containing address: {}", addr),
Err(err) => {
// We sure can just print this error with.
// println!("{}", err.as_ref());
// Because Error implementation implies Display implementation.
// But let's imagine we want to inspect error.
// Here deref coercion implicitly converts
// `&Box<dyn Error>` to `&dyn Error`.
// And downcast_ref tries to convert this &dyn Error
// back to &IoError, returning either
// Some(&IoError) or none
if Some(err) = err.downcast_ref::<IoError>() {
println!("Unfortunately, IO error occured: {}", err)
}
// There's also downcast_mut, which does the same, but gives us
// mutable reference.
if Some(mut err) = err.downcast_mut::<AddressParseError>() {
// Here we can mutate err. But we'll only print it.
println!(
"Unfortunately, what file was cantaining, \
was not in fact an ipv4 address: {}",
err
);
}
// Finally there's 'is' and 'downcast'.
// 'is' comapres "erased" type with some concrete type.
if err.is::<AddressIsLocalhostError>() {
// 'downcast' tries to convert Box<dyn Error + 'static>
// to box with value of some concrete type.
// Here - to Box<AddressIsLocalhostError>.
let err: Box<AddressIsLocalhostError> =
Error::downcast(err).unwrap();
}
}
};
}
To summarize: errors should (I'd say - must) provide useful information to caller, besides ability to just display them, thus they should not be strings. And errors must implement Error at least to preserve more-less consistent error handling experience across all crates. All the rest depends on situation.
Caio alredy mentioned The Rust Book.
But these links might be also useful:
std::any module level API documentation
std::error::Error API documentation
For simple use-cases, a opaque error type like Result<u32, &'static str> or Result<u32, String> is enough but for more complex libraries, it is useful and even encouraged to create your own error type like a struct MyError or enum AnotherLibError, which helps you to define better your intentions. You may also want to read the Error Handling chapter of the Rust by Example book.
The Error trait, as being part of std, helps developers in a generic and centralized manner to define their own error types to describe what happened and the possible root causes (backtrace). It is currently somewhat limited but there are plans to help to improve its usability.
When you use impl Error, you are telling the compiler that you don't care about the type being returned, as long it implements the Error trait. This approach is useful when the error type is too complex or when you want to generalize over the return type. E.g.:
fn example() -> Result<Duration, impl Error> {
let sys_time = SystemTime::now();
sleep(Duration::from_secs(1));
let new_sys_time = SystemTime::now();
sys_time.duration_since(new_sys_time)
}
The method duration_since returns a Result<Duration, SystemTimeError> type but in the above method signature, you can see that for the Err part of the Result, it is returning anything that implements the Error trait.
Summarizing everything, if you read the Rust book and know what you are doing, you can choose the approach that best fits your needs. Otherwise, it is best to define your own types for your errors or use some third-party utilities like the error-chain or failure crates.

How to call a closure in a method

I try to call a closure from a method do_something implemented for structure A. I read some posts about this, but I'm a bit lost now. Here is a simplified version:
// A simple structure
struct A<F> {
f: F,
}
// Implements new and do_something
impl<F> A<F> where F: Fn() {
fn new(f: F) -> A<F> {
A { f: f }
}
fn do_something(&self) {
self.f() // Error is here.
}
}
fn main() {
let a = A::new( || println!("Test") );
a.do_something()
}
It displays this error:
error: no method named f found for type &A<F> in the current scope
I thought closures were called just like this, but it seems I missed something. I tried to replace self.f() with self.f.call() (random test without really understanding), but it says two thing:
error: this function takes 1 parameter but 0 parameters were supplied
error: explicit use of unboxed closure method call is experimental [E0174]
I'm not sure about the first error, but I think I will not use this feature now if it's experimental.
Is there a way to call a closure in a method?
Wrap the member name in parenthesis:
fn do_something(&self) {
(self.f)()
}
If I recall correctly, the underlying cause has to do with precedence when parsing the code. self.f() looks for a method titled f, and will fail because it doesn't exist. (self.f)() causes it to be parsed differently, specifically looking for a member variable.
The second error is a problem here, unboxed closures are a pain to handle. You need tocould box the closure (place it behind a pointer), because closures can have all sorts of weird sizes.
EDIT: As Shepmaster pointed out, this partially incorrect. I will extend the old answer below because it might help when dealing with closures and passing them around.
(Also, call is experimental, and not necessary in this case, so let's do it without that)
struct A<F> {
f: Box<F>,
}
Now that it's stored in a Box (heap-allocated memory, but you could use other types of indirection when necessary), you should also initialize the structure properly:
fn new(f: F) -> A<F> {
A {
f: Box::new(f)
}
}
Finally, you will be able to call it, right?
fn do_something(&self) {
self.f() // Rust being silly
}
But the Rust compiler is still wanting to call a method here instead of our closure-field. So we explicitly dereference the box:
fn do_something(&self) {
(*self.f)() // Explain our intentions to the confused compiler
}
And now, it works! But do we need the indirection here? I thought so, but it seems like not (thanks Shep)! You see, the A struct is already generic, so it should have a size suitable for any single F type. Thus, we do not need the indirection, and can use the old definition:
struct A<F> {
f: F,
}
But now we've at least got a hint of what's happening here, and do_something can be reduced to
fn do_something(&self) {
(self.f)() // Explicitly access a struct member, then call it
}
So it seems it's just a syntactical limitation of the Rust compiler regarding the call syntax.

Resources