How type safety is guaranteed? - rust

Just skimming through the Rust guide (guessing game), this code fragment doesn't seem right to me:
let num = match input_num {
Some(num) => num,
None => {
println!("Please input a number!");
continue;
}
};
How does type inference of num work in this scenario? The first match arm obviously returns a number, whereas the second arm is just a println! and a continue statement, which doesn't return anything (or returns ()). How does the compiler conclude that this is type safe?

Let's look at that piece of code more closely:
loop {
// ... some code omitted ...
let num = match input_num {
Some(num) => num,
None => {
println!("Please input a number!");
continue;
}
};
// ... some code omitted ...
}
The match statement is located inside a loop, and the language has several constructs that help control the looping process. break exits from a loop early, while continue skips the rest of the code in the loop body and goes back to its beginning (restarts it). So the match above can be read as "Check the number, and if it is there, assign it to the num variable; otherwise output a message and restart from the beginning".
The behavior of "otherwise" branch is important: it ends with a control transfer operation, continue in this case. The compiler sees continue and knows that the loop is going to be restarted. Consequently, it does not really matter what value this branch yields, because it will never be used! It may as well never yield anything.
Such behavior is often modeled with a so-called bottom type, which is a subtype of every type and has no values at all. Rust has (essentially) no subtyping, so this type is deeply magical. It is denoted as ! in type signatures:
fn always_panic() -> ! {
panic!("oops!")
}
This function always panics, which causes stack unwinding and eventual termination of the thread it was called in, so its return value, if there was one, will never be read or otherwise inspected, so it is absolutely safe not to return anything at all, even if it is used in expression context which expects some concrete type:
let value: i32 = always_panic();
Because always_panic() has return type !, the compiler knows that it is not going to return anything (in this case because always_panic() starts stack unwinding). It is therefore safe to allow it to be used in place of any type: the value, even if there were one, is never going to be used.
continue works in exactly the same way, but locally. The None branch "returns" type !, while the Some branch returns a value of some concrete numeric type, so the whole match expression has this numeric type: the compiler knows that the None branch leads to a control transfer, and its result, even if it had one, will never be used.
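To make this concrete, here is a minimal runnable sketch of the same pattern. The helper name sum_present and its inputs are made up for illustration and are not part of the guide.

```rust
// Hypothetical helper, for illustration only: sums the values that are present.
fn sum_present(inputs: &[Option<i32>]) -> i32 {
    let mut total = 0;
    for input in inputs {
        // The `Some` arm yields an `i32`; the `None` arm diverges via
        // `continue`, so `num` is inferred as `i32`.
        let num = match *input {
            Some(num) => num,
            None => {
                println!("Please input a number!");
                continue;
            }
        };
        total += num;
    }
    total
}

fn main() {
    let inputs = [Some(1), None, Some(41)];
    println!("{}", sum_present(&inputs)); // prints 42 after one "Please input a number!" line
}
```

The line after the match only ever runs with a real i32 in num, which is why the compiler does not need a value from the None arm.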

continue is, along with break and return, "divergent". That is, the compiler knows that control flow does not resume after it, it goes somewhere else. This is also true of any function which returns !; this is how the compiler knows that functions like std::rt::begin_unwind never return.
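The other divergent constructs can be sketched the same way. All function names below (give_up, double, sum_until_none, positive_or_panic) are hypothetical and exist only to show that return, break and a call to a function returning ! are each accepted where a number is expected.

```rust
// Hypothetical diverging helper: the return type `!` tells the compiler
// that control never comes back to the caller.
fn give_up(reason: &str) -> ! {
    panic!("giving up: {}", reason)
}

// `return` in the `Err` arm diverges, so `n` has type `i32`.
fn double(text: &str) -> Option<i32> {
    let n: i32 = match text.trim().parse() {
        Ok(n) => n,
        Err(_) => return None,
    };
    Some(n * 2)
}

// `break` in the `None` arm diverges as well.
fn sum_until_none(items: &[Option<i32>]) -> i32 {
    let mut total = 0;
    for item in items {
        let n = match *item {
            Some(n) => n,
            None => break,
        };
        total += n;
    }
    total
}

// A call to a `-> !` function is accepted where an `i32` is expected.
fn positive_or_panic(n: i32) -> i32 {
    if n > 0 {
        n
    } else {
        give_up("expected a positive number")
    }
}

fn main() {
    println!("{:?}", double("21"));
    println!("{}", sum_until_none(&[Some(1), Some(2), None, Some(100)]));
    println!("{}", positive_or_panic(5));
}
```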

Related

How to handle methods which can panic? [duplicate]

I noticed that Rust does not have exceptions. How to do error handling in Rust and what are the common pitfalls? Are there ways to control flow with raise, catch, reraise and other stuff? I found inconsistent information on this.
Rust generally solves errors in two ways:
Unrecoverable errors. Once you panic!, that's it. Your program or thread aborts because it encounters something it can't solve and its invariants have been violated. E.g. if you find invalid sequences in what should be a UTF-8 string.
Recoverable errors. Also called failures in some documentation. Instead of panicking, you return an Option<T> or a Result<T, E>. In these cases, you have a choice between a valid value, Some(T) or Ok(T) respectively, and an invalid value, None or Err(E). Generally None serves as a null replacement, showing that the value is missing.
Now comes the hard part. Application.
Unwrap
Sometimes dealing with an Option is a pain in the neck, and you are almost guaranteed to get a value and not an error.
In those cases it's perfectly fine to use unwrap. unwrap turns Some(e) and Ok(e) into e, otherwise it panics. Unwrap is a tool to turn your recoverable errors into unrecoverable.
if x.is_some() {
y = x.unwrap(); // perfectly safe, you just checked x is Some
}
Inside the if-block it's perfectly fine to unwrap since it should never panic because we've already checked that it is Some with x.is_some().
If you're writing a library, using unwrap is discouraged because when it panics the user cannot handle the error. Additionally, a future update may change the invariant. Imagine if the example above had if x.is_some() || always_return_true(). The invariant would change, and unwrap could panic.
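One way to avoid that drift is to check and extract in a single step, so that there is no separate condition to keep in sync with the unwrap call. A small sketch, with hypothetical helper names:

```rust
// Hypothetical helper: doubles the value if there is one, otherwise returns 0.
fn double_or_zero(x: Option<i32>) -> i32 {
    // `if let` tests for `Some` and binds the inner value at the same time,
    // so no `unwrap` is needed and nothing here can panic.
    if let Some(value) = x {
        value * 2
    } else {
        0
    }
}

// The same idea using a method with a fallback instead of a branch.
fn value_or_default(x: Option<i32>) -> i32 {
    x.unwrap_or(0)
}

fn main() {
    println!("{}", double_or_zero(Some(21)));
    println!("{}", value_or_default(None));
}
```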
? operator / try! macro
What's the ? operator or the try! macro? A short explanation is that it either yields the value inside an Ok() or returns the error from the enclosing function early.
Here is a simplified definition of what the operator or macro expand to:
macro_rules! try {
($e:expr) => (match $e {
Ok(val) => val,
Err(err) => return Err(err),
});
}
If you use it like this:
let x = File::create("my_file.txt")?;
let x = try!(File::create("my_file.txt"));
It will convert it into this:
let x = match File::create("my_file.txt") {
Ok(val) => val,
Err(err) => return Err(err),
};
The downside is that your functions now return Result.
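Here is a self-contained sketch of ? in a function that returns Result. The function add_strs is made up for this example. Note that the real operator also passes the error through From::from, which the simplified expansion above leaves out.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parses two strings and adds them.
// Each `?` either yields the parsed number or returns the parse error
// from `add_strs` immediately.
fn add_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x: i32 = a.trim().parse()?;
    let y: i32 = b.trim().parse()?;
    Ok(x + y)
}

fn main() {
    match add_strs("40", "2") {
        Ok(sum) => println!("sum = {}", sum),
        Err(err) => println!("error: {}", err),
    }
}
```

The caller decides what to do with the error, which is the point of making the function return Result.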
Combinators
Option and Result have some convenience methods that allow chaining and dealing with errors in an understandable manner. Methods like and, and_then, or, or_else, ok_or, map_err, etc.
For example, you could have a default value in case your value is botched.
let x: Option<i32> = None;
let guaranteed_value = x.or(Some(3)); //it's Some(3)
Or if you want to turn your Option into a Result.
let x = Some("foo");
assert_eq!(x.ok_or("No value found"), Ok("foo"));
let x: Option<&str> = None;
assert_eq!(x.ok_or("No value found"), Err("No value found"));
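Several of these methods can be chained. The sketch below assumes a hypothetical parse_port helper and shows ok_or_else, and_then, map_err and unwrap_or working together.

```rust
// Hypothetical helper: turns an optional string into a port number,
// reporting both "missing" and "malformed" as `Err(String)`.
fn parse_port(raw: Option<&str>) -> Result<u16, String> {
    raw
        // Option -> Result: a missing value becomes an error.
        .ok_or_else(|| "no port given".to_string())
        // Runs only on `Ok`; the parse error is converted to a `String`.
        .and_then(|text| {
            text.trim()
                .parse::<u16>()
                .map_err(|e| format!("bad port: {}", e))
        })
}

fn main() {
    println!("{:?}", parse_port(Some("80")));
    println!("{:?}", parse_port(None));
    // Fall back to a default when anything went wrong.
    println!("{}", parse_port(Some("oops")).unwrap_or(8080));
}
```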
This is just a brief skim of things you can do. For more explanation, check out:
http://blog.burntsushi.net/rust-error-handling/
https://doc.rust-lang.org/book/ch09-00-error-handling.html
http://lucumr.pocoo.org/2014/10/16/on-error-handling/
If you need to terminate some independent unit of execution (a web request, the processing of a video frame, a GUI event, a source file being compiled) without bringing down the whole application, there is a function std::panic::catch_unwind that invokes a closure and captures the cause of an unwinding panic if one occurs.
let result = panic::catch_unwind(|| {
panic!("oh no!");
});
assert!(result.is_err());
I would not grant this closure write access to any variables that could outlive it, or any other otherwise global state.
The documentation also notes that the function may not be able to catch every kind of panic.
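A fuller sketch with the import spelled out, assuming the program is built with the default unwinding panic strategy. The wrapper run_isolated is a hypothetical name; it turns the panic payload into a String when the payload is a string.

```rust
use std::panic::{self, UnwindSafe};

// Hypothetical wrapper: runs `work` and converts a panic into `Err(message)`.
fn run_isolated<F>(work: F) -> Result<i32, String>
where
    F: FnOnce() -> i32 + UnwindSafe,
{
    panic::catch_unwind(work).map_err(|payload| {
        // The payload is a `Box<dyn Any + Send>`; string panics carry
        // either a `&str` or a `String`.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic payload".to_string()
        }
    })
}

fn main() {
    println!("{:?}", run_isolated(|| 7));
    // The default panic hook still prints the panic message to stderr.
    println!("{:?}", run_isolated(|| -> i32 { panic!("oh no!") }));
}
```

The closures here capture nothing, which sidesteps the concern about shared state being left half-modified by a panic.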

How does match compile with `continue` in its arms?

I'm reading The Rust Programming Language book and I stumbled upon a simple expression:
let guess: u32 = match guess.trim().parse() {
Ok(num) => num,
Err(_) => continue,
};
How does match work with different kinds of expressions in its arms? E.g. the first arm would simply "return" num so that it's assigned to guess but in the second arm the expression is simply continue. How does match handle that and doesn't "assign" continue to guess but executes it? What happens with the whole assignment expression itself? Is it dropped from the call stack (if that's the correct term)?
continue has a special type: it returns the never type, denoted !.
This type means "the code after that is unreachable". Since continue jumps to the next cycle of the loop, it'll never actually return any value (the same is true for return and break, and it's also the return type of panic!(), including all macros that panic: unreachable!(), todo!(), etc.).
The never type is special because it coerces (converts automatically) to any type (because if something cannot happen, we have no problem thinking about it as u32 or String or whatever: it will just not happen). This means it also unifies with any other type, meaning the unification of any type and ! is the other type.
match requires the expressions' type to unify (as does if). So your code returns the unification of ! and u32 == u32.
You can see this if you annotate the types explicitly (requires nightly, since using the ! type anywhere other than the return type position is experimental):
let num = match num {
Ok(num) => {
let num: i32 = num;
num
}
Err(()) => {
let never: ! = continue;
never
}
};
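On stable Rust the same unification can be observed without naming ! at all. The sketch below uses a hypothetical helper, collect_even_numbers, and shows a match arm and an if/else branch each unifying u32 with a diverging continue.

```rust
// Hypothetical helper: keeps the even numbers that parse successfully.
fn collect_even_numbers(lines: &[&str]) -> Vec<u32> {
    let mut out = Vec::new();
    for line in lines {
        // `match` arms: `u32` unified with `!` is `u32`.
        let n: u32 = match line.trim().parse() {
            Ok(num) => num,
            Err(_) => continue,
        };
        // `if`/`else` branches unify the same way.
        let even: u32 = if n % 2 == 0 { n } else { continue };
        out.push(even);
    }
    out
}

fn main() {
    println!("{:?}", collect_even_numbers(&["2", "x", "3", " 10 "])); // prints [2, 10]
}
```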

Getting query string from Window object in WebAssembly in Rust

Context: I am learning Rust & WebAssembly and as a practice exercise I have a project that paints stuff in HTML Canvas from Rust code. I want to get the query string from the web request and from there the code can decide which drawing function to call.
I wrote this function to just return the query string with the leading ? removed:
fn decode_request(window: web_sys::Window) -> std::string::String {
let document = window.document().expect("no global window exist");
let location = document.location().expect("no location exists");
let raw_search = location.search().expect("no search exists");
let search_str = raw_search.trim_start_matches("?");
format!("{}", search_str)
}
It does work, but it seems amazingly verbose given how much simpler it would be in some of the other languages I have used.
Is there an easier way to do this? Or is the verbosity just the price you pay for safety in Rust and I should just get used to it?
Edit per answer from @IInspectable:
I tried the chaining approach and I get an error of:
temporary value dropped while borrowed
creates a temporary which is freed while still in use
note: consider using a `let` binding to create a longer lived value rustc(E0716)
It would be nice to understand that better; I am still getting the niceties of ownership through my head. The code is now:
fn decode_request(window: Window) -> std::string::String {
let location = window.location();
let search_str = location.search().expect("no search exists");
let search_str = search_str.trim_start_matches('?');
search_str.to_owned()
}
which is certainly an improvement.
This question is really about API design rather than its effects on the implementation. The implementation turned out to be fairly verbose mostly due to the contract chosen: Either produce a value, or die. There's nothing inherently wrong with this contract. A client calling into this function will never observe invalid data, so this is perfectly safe.
This may not be the best option for library code, though. Library code usually lacks context, and cannot make a good call on whether any given error condition is fatal or not. That's a question client code is in a far better position to answer.
Before moving on to explore alternatives, let's rewrite the original code in a more compact fashion, by chaining the calls together, without explicitly assigning each result to a variable:
fn decode_request(window: web_sys::Window) -> std::string::String {
window
.location()
.search().expect("no search exists")
.trim_start_matches('?')
.to_owned()
}
I'm not familiar with the web_sys crate, so there is a bit of guesswork involved, namely the assumption that window.location() returns the same value as the document()'s location(). Apart from chaining calls, the code presented employs two more changes:
trim_start_matches() is passed a character literal in place of a string literal. This produces optimal code without relying on the compiler's optimizer to figure out, that a string of length 1 is attempting to search for a single character.
The return value is constructed by calling to_owned(). The format! macro adds overhead, and eventually calls to_string(). While that would exhibit the same behavior in this case, using the semantically more accurate to_owned() function helps you catch errors at compile time (e.g. if you accidentally returned 42.to_string()).
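The "temporary value dropped while borrowed" error quoted in the question can be reproduced without web_sys. The sketch below uses a hypothetical search() function as a stand-in for location.search(), on the assumption that the real call hands back an owned String; whether this matches the exact chain that failed in the question is a guess.

```rust
// Hypothetical stand-in for `location.search()`: returns an owned `String`.
fn search() -> String {
    String::from("?shape=circle")
}

fn decode_chained() -> String {
    // Fine: the temporary `String` lives until the end of this statement,
    // and `to_owned()` copies the trimmed slice before it is dropped.
    search().trim_start_matches('?').to_owned()
}

fn decode_with_binding() -> String {
    // Rejected with E0716, because the slice would outlive the temporary:
    //     let trimmed = search().trim_start_matches('?');
    //     trimmed.to_owned()
    //
    // Fine: `raw` owns the `String`, so the borrowed slice stays valid.
    let raw = search();
    let trimmed: &str = raw.trim_start_matches('?');
    trimmed.to_owned()
}

fn main() {
    println!("{}", decode_chained());
    println!("{}", decode_with_binding());
}
```

trim_start_matches returns a slice borrowed from its input, so the input has to stay alive for as long as the slice is used. Either finish with the slice inside one statement or give the owner a name with let.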
Alternatives
A more natural way to implement this function is to have it return either a value representing the query string, or no value at all. Rust provides the Option type to conveniently model this:
fn decode_request(window: web_sys::Window) -> Option<String> {
match window
.location()
.search() {
Ok(s) => Some(s.trim_start_matches('?').to_owned()),
_ => None,
}
}
This allows a client of the function to make decisions, depending on whether the function returns Some(s) or None. This maps all error conditions into a None value.
If it is desirable to convey the reason for failure back to the caller, the decode_request function can choose to return a Result value instead, e.g. Result<String, wasm_bindgen::JsValue>. In doing so, an implementation can take advantage of the ? operator, to propagate errors to the caller in a compact way:
fn decode_request(window: web_sys::Window) -> Result<String, wasm_bindgen::JsValue> {
Ok(window
.location()
.search()?
.trim_start_matches('?')
.to_owned())
}
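The two contracts can be compared side by side without a browser. In this sketch fetch_search is a hypothetical stand-in for window.location().search(), and a plain String replaces the JsValue error type so that the code runs on its own. The Option version uses .ok().map(...) as a more compact spelling of the match shown earlier.

```rust
// Hypothetical stand-in for `window.location().search()`.
// A `String` error is used here in place of `JsValue`.
fn fetch_search(available: bool) -> Result<String, String> {
    if available {
        Ok("?shape=circle".to_string())
    } else {
        Err("no search".to_string())
    }
}

// Option contract: every error collapses into `None`.
fn decode_opt(available: bool) -> Option<String> {
    fetch_search(available)
        .ok()
        .map(|s| s.trim_start_matches('?').to_owned())
}

// Result contract: `?` forwards the error to the caller unchanged.
fn decode_res(available: bool) -> Result<String, String> {
    Ok(fetch_search(available)?
        .trim_start_matches('?')
        .to_owned())
}

fn main() {
    println!("{:?}", decode_opt(true));
    println!("{:?}", decode_res(false));
}
```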

What does returning "!" mean in Rust?

Recently I came across a function in Rust that returned ! instead of basic type, like this:
fn my_function() -> ! {
// ...
}
What does it mean? I was unable to find any information about this in The Rust Book. What data does a function with such a return type return?
It means the function never returns (usually because it unconditionally panics or otherwise ends the program, or because it contains an infinite loop that prevents a return from ever happening).
The appendix describes it as:
! Always empty bottom type for diverging functions
where "diverging" means "never returns".
To give some additional context:
! is the never type; it's a type that has no possible value, so it can never be created. If a function returns !, this means that it never completes.
Examples:
fn panics() -> ! {
panic!()
}
fn loops_forever() -> ! {
loop { }
}
At the moment, the ! type is unstable, so it can only be used in return position. In the future, when the never type is stabilized, we will be able to write things like Result<T, !> (a result that's never an error).
Note that ! can be coerced to any other type. In that sense ! behaves like a subtype of every other type, which is why it is often called the "bottom type". It means that we are allowed to write, for example:
let x: i32 = if some_condition {
42
} else {
panic!("`!` is coerced to `i32`")
};
Since ! doesn't work on stable Rust (except in return position), there's a workaround to get a similar type:
enum Never {}
This enum has no variants, so it can never be created. It plays a similar role to !, although unlike ! it does not coerce to other types automatically.
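A short sketch of how such an empty enum can be used on stable Rust. The helpers unwrap_never and always_ok are hypothetical. The empty match is what stands in for the automatic coercion that ! would provide.

```rust
// An enum with no variants: no value of this type can ever exist.
enum Never {}

// Hypothetical helper: `Err(Never)` cannot be constructed, so the `Err`
// arm is handled by an empty match, which type-checks as any type.
fn unwrap_never<T>(result: Result<T, Never>) -> T {
    match result {
        Ok(value) => value,
        Err(never) => match never {},
    }
}

// A function whose signature promises that it cannot fail.
fn always_ok(n: i32) -> Result<i32, Never> {
    Ok(n + 1)
}

fn main() {
    println!("{}", unwrap_never(always_ok(41))); // prints 42
}
```

The standard library ships an enum of the same shape, std::convert::Infallible, for exactly this kind of "cannot fail" error type.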

What does () mean as an argument in a function where a parameter of type T is expected?

I am new to Rust and I was reading the Dining Philosophers' tutorial when I found this:
Mutex::new(())
I don't know what the argument inside new means. I read the documentation for Mutex and I still have no idea what it means. I would appreciate an explanation about what is happening under the hood.
() is the empty tuple, also called the unit type -- a tuple with no member types. It is also the only valid value of said type. It has a size of zero (note that it is still Sized, just with a size of 0), making it nonexistent at runtime. This has several useful effects, one of which is being used here.
Here, () is used to create a Mutex with no owned data -- it's just an unlockable and lockable mutex. If we explicitly write out the type inference with the turbofish operator ::<>, we could also write:
Mutex::<()>::new( () )
That is, we're creating a new Mutex that contains a () with the initial value ().
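As an illustration of a mutex used purely as a lock, the sketch below has several threads enter a critical section guarded by a Mutex<()> and records how many were inside at once. The function max_concurrency and the atomic counters are assumptions made for this example; they are not taken from the tutorial.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical experiment: returns the largest number of threads that
// were ever inside the critical section at the same time.
fn max_concurrency(threads: usize) -> usize {
    let lock = Arc::new(Mutex::new(())); // guards no data, only the section
    let inside = Arc::new(AtomicUsize::new(0));
    let max_seen = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let lock = Arc::clone(&lock);
            let inside = Arc::clone(&inside);
            let max_seen = Arc::clone(&max_seen);
            thread::spawn(move || {
                // The guard holds a `()`; only its lifetime matters.
                let _guard = lock.lock().unwrap();
                let now = inside.fetch_add(1, Ordering::SeqCst) + 1;
                max_seen.fetch_max(now, Ordering::SeqCst);
                inside.fetch_sub(1, Ordering::SeqCst);
                // `_guard` is dropped here, releasing the lock.
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    max_seen.load(Ordering::SeqCst)
}

fn main() {
    println!("{}", max_concurrency(4)); // prints 1
}
```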
() is simply a tuple with no values; a 0-tuple. The type and the value are spelled the same, both (). The type is sometimes known as the "unit type"; it used to actually be a distinct type in the compiler, but now is just treated as a degenerate tuple. It is a 0-sized type; objects of this type won't ever actually take up any space, though it is a Sized type, just with a size of 0.
It is used for cases where you need to have a value or a type, but you have nothing relevant to put there. For instance, if you have a function that doesn't return a value, and call it in a place that expects a value, you find that it actually returns the value () of type ().
fn nothing() {}
fn main() {
println!("{:?}", nothing());
}
That prints ().
Another use is when you have a generic type like Result<T, E>, which indicates the success or failure of some operation, and can hold either the result of the successful operation or an error indicating why it failed. Some operations, such as std::io::Write::write_all, have no value to return on success but still want to be able to report an error; they return a std::io::Result<()>, which is a synonym for Result<(), std::io::Error>. That allows the function to return Ok(()) in the success case, but some meaningful error when it fails.
You might compare it to void in C or C++, which are also used for a lack of return value. However, you cannot ever write an object that has type void, which makes void much less useful in generic programming; you could never have an equivalent Result<void, Error> type, because you couldn't ever construct the Ok case.
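A small sketch of () as the success type of a Result, using a made-up validation function check_age. It also shows that () is an ordinary value that can be bound to a variable and compared.

```rust
// Hypothetical validator: there is nothing useful to return on success,
// so the `Ok` case carries `()`.
fn check_age(age: i32) -> Result<(), String> {
    if age >= 0 {
        Ok(())
    } else {
        Err(format!("negative age: {}", age))
    }
}

// A function with no declared return type returns `()`.
fn nothing() {}

fn main() {
    println!("{:?}", check_age(3));
    println!("{:?}", check_age(-1));

    // `()` is a real value: it can be stored and compared like any other.
    let unit: () = nothing();
    println!("{}", unit == ());
    println!("{}", std::mem::size_of::<()>());
}
```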
In this case, a Mutex normally wraps an object that you want to access; so you can put that object into the mutex, and then access it from the guard that you get when you lock the mutex. However, in this example there is no actual data being guarded, so () is used since you need to put something in there, and Mutex is generic over the type so it can accept any type.

Resources