Getting query string from Window object in WebAssembly in Rust - rust

Context: I am learning Rust & WebAssembly and as a practice exercise I have a project that paints stuff in HTML Canvas from Rust code. I want to get the query string from the web request and from there the code can decide which drawing function to call.
I wrote this function to just return the query string with the leading ? removed:
fn decode_request(window: web_sys::Window) -> std::string::String {
let document = window.document().expect("no global window exist");
let location = document.location().expect("no location exists");
let raw_search = location.search().expect("no search exists");
let search_str = raw_search.trim_start_matches("?");
format!("{}", search_str)
}
It does work, but it seems amazingly verbose given how much simpler it would be in some of the other languages I have used.
Is there an easier way to do this? Or is the verbosity just the price you pay for safety in Rust and I should just get used to it?
Edit per answer from #IInspectable:
I tried the chaining approach and I get an error of:
temporary value dropped while borrowed
creates a temporary which is freed while still in use
note: consider using a `let` binding to create a longer lived value rustc(E0716)
It would be nice to understand that better; I am still getting the niceties of ownership through my head. Is now:
fn decode_request(window: Window) -> std::string::String {
let location = window.location();
let search_str = location.search().expect("no search exists");
let search_str = search_str.trim_start_matches('?');
search_str.to_owned()
}
which is certainly an improvement.

This question is really about API design rather than its effects on the implementation. The implementation turned out to be fairly verbose mostly due to the contract chosen: Either produce a value, or die. There's nothing inherently wrong with this contract. A client calling into this function will never observe invalid data, so this is perfectly safe.
This may not be the best option for library code, though. Library code usually lacks context, and cannot make a good call on whether any given error condition is fatal or not. That's a question client code is in a far better position to answer.
Before moving on to explore alternatives, let's rewrite the original code in a more compact fashion, by chaining the calls together, without explicitly assigning each result to a variable:
fn decode_request(window: web_sys::Window) -> std::string::String {
window
.location()
.search().expect("no search exists")
.trim_start_matches('?')
.to_owned()
}
I'm not familiar with the web_sys crate, so there is a bit of guesswork involved. Namely, the assumption, that window.location() returns the same value as the document()'s location(). Apart from chaining calls, the code presented employs two more changes:
trim_start_matches() is passed a character literal in place of a string literal. This produces optimal code without relying on the compiler's optimizer to figure out, that a string of length 1 is attempting to search for a single character.
The return value is constructed by calling to_owned(). The format! macro adds overhead, and eventually calls to_string(). While that would exhibit the same behavior in this case, using the semantically more accurate to_owned() function helps you catch errors at compile time (e.g. if you accidentally returned 42.to_string()).
Alternatives
A more natural way to implement this function is to have it return either a value representing the query string, or no value at all. Rust provides the Option type to conveniently model this:
fn decode_request(window: web_sys::Window) -> Option<String> {
match window
.location()
.search() {
Ok(s) => Some(s.trim_start_matches('?').to_owned()),
_ => None,
}
}
This allows a client of the function to make decisions, depending on whether the function returns Some(s) or None. This maps all error conditions into a None value.
If it is desirable to convey the reason for failure back to the caller, the decode_request function can choose to return a Result value instead, e.g. Result<String, wasm_bindgen::JsValue>. In doing so, an implementation can take advantage of the ? operator, to propagate errors to the caller in a compact way:
fn decode_request(window: web_sys::Window) -> Result<String, wasm_bindgen::JsValue> {
Ok(window
.location()
.search()?
.trim_start_matches('?')
.to_owned())
}

Related

How to handle methods which can panic? [duplicate]

I noticed that Rust does not have exceptions. How to do error handling in Rust and what are the common pitfalls? Are there ways to control flow with raise, catch, reraise and other stuff? I found inconsistent information on this.
Rust generally solves errors in two ways:
Unrecoverable errors. Once you panic!, that's it. Your program or thread aborts because it encounters something it can't solve and its invariants have been violated. E.g. if you find invalid sequences in what should be a UTF-8 string.
Recoverable errors. Also called failures in some documentation. Instead of panicking, you emit a Option<T> or Result<T, E>. In these cases, you have a choice between a valid value Some(T)/Ok(T) respectively or an invalid value None/Error(E). Generally None serves as a null replacement, showing that the value is missing.
Now comes the hard part. Application.
Unwrap
Sometimes dealing with an Option is a pain in the neck, and you are almost guaranteed to get a value and not an error.
In those cases it's perfectly fine to use unwrap. unwrap turns Some(e) and Ok(e) into e, otherwise it panics. Unwrap is a tool to turn your recoverable errors into unrecoverable.
if x.is_some() {
y = x.unwrap(); // perfectly safe, you just checked x is Some
}
Inside the if-block it's perfectly fine to unwrap since it should never panic because we've already checked that it is Some with x.is_some().
If you're writing a library, using unwrap is discouraged because when it panics the user cannot handle the error. Additionally, a future update may change the invariant. Imagine if the example above had if x.is_some() || always_return_true(). The invariant would changed, and unwrap could panic.
? operator / try! macro
What's the ? operator or the try! macro? A short explanation is that it either returns the value inside an Ok() or prematurely returns error.
Here is a simplified definition of what the operator or macro expand to:
macro_rules! try {
($e:expr) => (match $e {
Ok(val) => val,
Err(err) => return Err(err),
});
}
If you use it like this:
let x = File::create("my_file.txt")?;
let x = try!(File::create("my_file.txt"));
It will convert it into this:
let x = match File::create("my_file.txt") {
Ok(val) => val,
Err(err) => return Err(err),
};
The downside is that your functions now return Result.
Combinators
Option and Result have some convenience methods that allow chaining and dealing with errors in an understandable manner. Methods like and, and_then, or, or_else, ok_or, map_err, etc.
For example, you could have a default value in case your value is botched.
let x: Option<i32> = None;
let guaranteed_value = x.or(Some(3)); //it's Some(3)
Or if you want to turn your Option into a Result.
let x = Some("foo");
assert_eq!(x.ok_or("No value found"), Ok("foo"));
let x: Option<&str> = None;
assert_eq!(x.ok_or("No value found"), Err("No value found"));
This is just a brief skim of things you can do. For more explanation, check out:
http://blog.burntsushi.net/rust-error-handling/
https://doc.rust-lang.org/book/ch09-00-error-handling.html
http://lucumr.pocoo.org/2014/10/16/on-error-handling/
If you need to terminate some independent execution unit (a web request, a video frame processing, a GUI event, a source file to compile) but not all your application in completeness, there is a function std::panic::catch_unwind that invokes a closure, capturing the cause of an unwinding panic if one occurs.
let result = panic::catch_unwind(|| {
panic!("oh no!");
});
assert!(result.is_err());
I would not grant this closure write access to any variables that could outlive it, or any other otherwise global state.
The documentation also says the function also may not be able to catch some kinds of panic.

Ergonomically passing a slice of trait objects

I am converting a variety of types to String when they are passed to a function. I'm not concerned about performance as much as ergonomics, so I want the conversion to be implicit. The original, less generic implementation of the function simply used &[impl Into<String>], but I think that it should be possible to pass a variety of types at once without manually converting each to a string.
The key is that ideally, all of the following cases should be valid calls to my function:
// String literals
perform_tasks(&["Hello", "world"]);
// Owned strings
perform_tasks(&[String::from("foo"), String::from("bar")]);
// Non-string types
perform_tasks(&[1,2,3]);
// A mix of any of them
perform_tasks(&["All", 3, String::from("types!")]);
Some various signatures I've attempted to use:
fn perform_tasks(items: &[impl Into<String>])
The original version fails twice; it can't handle numeric types without manual conversion, and it requires all of the arguments to be the same type.
fn perform_tasks(items: &[impl ToString])
This is slightly closer, but it still requires all of the arguments to be of one type.
fn perform_tasks(items: &[&dyn ToString])
Doing it this way is almost enough, but it won't compile unless I manually add a borrow on each argument.
And that's where we are. I suspect that either Borrow or AsRef will be involved in a solution, but I haven't found a way to get them to handle this situation. For convenience, here is a playground link to the final signature in use (without the needed references for it to compile), alongside the various tests.
The following way works for the first three cases if I understand your intention correctly.
pub fn perform_tasks<I, A>(values: I) -> Vec<String>
where
A: ToString,
I: IntoIterator<Item = A>,
{
values.into_iter().map(|s| s.to_string()).collect()
}
As the other comments pointed out, Rust does not support an array of mixed types. However, you can do one extra step to convert them into a &[&dyn fmt::Display] and then call the same function perform_tasks to get their strings.
let slice: &[&dyn std::fmt::Display] = &[&"All", &3, &String::from("types!")];
perform_tasks(slice);
Here is the playground.
If I understand your intention right, what you want is like this
fn main() {
let a = 1;
myfn(a);
}
fn myfn(i: &dyn SomeTrait) {
//do something
}
So it's like implicitly borrow an object as function argument. However, Rust won't let you to implicitly borrow some objects since borrowing is quite an important safety measure in rust and & can help other programmers quickly identified which is a reference and which is not. Thus Rust is designed to enforce the & to avoid confusion.

A cell with interior mutability allowing arbitrary mutation actions

Standard Cell struct provides interior mutability but allows only a few mutation methods such as set(), swap() and replace(). All of these methods change the whole content of the Cell.
However, sometimes more specific manipulations are needed, for example, to change only a part of data contained in the Cell.
So I tried to implement some kind of universal Cell, allowing arbitrary data manipulation.
The manipulation is represented by user-defined closure that accepts a single argument - &mut reference to the interior data of the Cell, so the user itself can deside what to do with the Cell interior. The code below demonstrates the idea:
use std::cell::UnsafeCell;
struct MtCell<Data>{
dcell: UnsafeCell<Data>,
}
impl<Data> MtCell<Data>{
fn new(d: Data) -> MtCell<Data> {
return MtCell{dcell: UnsafeCell::new(d)};
}
fn exec<F, RetType>(&self, func: F) -> RetType where
RetType: Copy,
F: Fn(&mut Data) -> RetType
{
let p = self.dcell.get();
let pd: &mut Data;
unsafe{ pd = &mut *p; }
return func(pd);
}
}
// test:
type MyCell = MtCell<usize>;
fn main(){
let c: MyCell = MyCell::new(5);
println!("initial state: {}", c.exec(|pd| {return *pd;}));
println!("state changed to {}", c.exec(|pd| {
*pd += 10; // modify the interior "in place"
return *pd;
}));
}
However, I have some concerns regarding the code.
Is it safe, i.e can some safe but malicious closure break Rust mutability/borrowing/lifetime rules by using this "universal" cell?
I consider it safe since lifetime of the interior reference parameter prohibits its exposition beyond the closure call time. But I still have doubts (I'm new to Rust).
Maybe I'm re-inventing the wheel and there exist some templates or techniques solving the problem?
Note: I posted the question here (not on code review) as it seems more related to the language rather than code itself (which represents just a concept).
[EDIT] I'd want zero cost abstraction without possibility of runtime failures, so RefCell is not perfect solution.
This is a very common pitfall for Rust beginners.
Is it safe, i.e can some safe but malicious closure break Rust mutability/borrowing/lifetime rules by using this "universal" cell? I consider it safe since lifetime of the interior reference parameter prohibits its exposition beyond the closure call time. But I still have doubts (I'm new to Rust).
In a word, no.
Playground
fn main() {
let mt_cell = MtCell::new(123i8);
mt_cell.exec(|ref1: &mut i8| {
mt_cell.exec(|ref2: &mut i8| {
println!("Double mutable ref!: {:?} {:?}", ref1, ref2);
})
})
}
You're absolutely right that the reference cannot be used outside of the closure, but inside the closure, all bets are off! In fact, pretty much any operation (read or write) on the cell within the closure is undefined behavior (UB), and may cause corruption/crashes anywhere in your program.
Maybe I'm re-inventing the wheel and there exist some templates or techniques solving the problem?
Using Cell is often not the best technique, but it's impossible to know what the best solution is without knowing more about the problem.
If you insist on Cell, there are safe ways to do this. The unstable (ie. beta) Cell::update() method is literally implemented with the following code (when T: Copy):
pub fn update<F>(&self, f: F) -> T
where
F: FnOnce(T) -> T,
{
let old = self.get();
let new = f(old);
self.set(new);
new
}
Or you could use Cell::get_mut(), but I guess that defeats the whole purpose of Cell.
However, usually the best way to change only part of a Cell is by breaking it up into separate Cells. For example, instead of Cell<(i8, i8, i8)>, use (Cell<i8>, Cell<i8>, Cell<i8>).
Still, IMO, Cell is rarely the best solution. Interior mutability is a common design in C and many other languages, but it is somewhat more rare in Rust, at least via shared references and Cell, for a number of reasons (e.g. it's not Sync, and in general people don't expect interior mutability without &mut). Ask yourself why you are using Cell and if it is really impossible to reorganize your code to use normal &mut references.
IMO the bottom line is actually about safety: if no matter what you do, the compiler complains and it seems that you need to use unsafe, then I guarantee you that 99% of the time either:
There's a safe (but possibly complex/unintuitive) way to do it, or
It's actually undefined behavior (like in this case).
EDIT: Frxstrem's answer also has better info about when to use Cell/RefCell.
Your code is not safe, since you can call c.exec inside c.exec to get two mutable references to the cell contents, as demonstrated by this snippet containing only safe code:
let c: MyCell = MyCell::new(5);
c.exec(|n| {
// need `RefCell` to access mutable reference from within `Fn` closure
let n = RefCell::new(n);
c.exec(|m| {
let n = &mut *n.borrow_mut();
// now `n` and `m` are mutable references to the same data, despite using
// no unsafe code. this is BAD!
})
})
In fact, this is exactly the reason why we have both Cell and RefCell:
Cell only allows you to get and set a value and does not allow you to get a mutable reference from an immutable one (thus avoiding the above issue), but it does not have any runtime cost.
RefCell allows you to get a mutable reference from an immutable one, but needs to perform checks at runtime to ensure that this is safe.
As far as I know, there's not really any safe way around this, so you need to make a choice in your code between no runtime cost but less flexibility, and more flexibility but with a small runtime cost.

Are there any nice cases where we should use `unwrap`? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
Since using unwrap may be problematic because it crashes in the error scenario, it may be considered as dangerous usage.
What if I am hundred percent sure that it will not crash, like in the following scenarios:
if option.is_some() {
let value = option.unwrap();
}
if result.is_ok() {
let result_value = result.unwrap();
}
Since we already checked the Result and Option there will be no crash with the unwrap() usage. However, we could have used match or if let. In my opinion, either match or if let usage is more elegant.
Let's focus on Result; I'll go back to Option at the end.
The purpose of Result is to signal a result which may succeed or fail with an error. As such, any use of it should fall into this category. Let's ignore the cases where a crate returns Result for impossible-to-fail operations.
By doing what you are doing (checking if result.is_ok() then extracting the value), you're effectively doing the same thing twice. The first time, you're inspecting the content of the Result, and the second, you're checking and extracting unsafely.
This could indeed have been done with a match or map, and both would have been more idiomatic than an if. Consider this case for a moment:
You have an object implementing the following trait:
use std::io::{Error, ErrorKind};
trait Worker {
fn hours_left(&self) -> Result<u8, Error>;
fn allocate_hours(&mut self, hours: u8) -> Result<u8, Error>;
}
We're going to assume hours_left() does exactly what it says on the tin. We'll also assume we have a mutable borrow of Worker. Let's implement allocate_hours().
In order to do so, we'll obviously need to check if our worker has extra hours left over to allocate. You could write it similar to yours:
fn allocate_hours(&mut self, hours: u8) {
let hours_left = self.hours_left();
if (hours_left.is_ok()) {
let remaining_hours = hours_left.unwrap();
if (remaining_hours < hours) {
return Err(Error::new(ErrorKind::NotFound, "Not enough hours left"));
}
// Do the actual operation and return
} else {
return hours_left;
}
}
However, this implementation is both clunky and inefficient. We can simplify this by avoiding unwrap and if statements altogether.
fn allocate_hours(&mut self, hours: u8) -> Result<u8, Error> {
self.hours_left()
.and_then(|hours_left| {
// We are certain that our worker is actually there to receive hours
// but we are not sure if he has enough hours. Check.
match hours_left {
x if x >= hours => Ok(x),
_ => Err(Error::new(ErrorKind::NotFound, "Not enough hours")),
}
})
.map(|hours_left| {
// At this point we are sure the worker has enough hours.
// Do the operations
})
}
We've killed multiple birds with one stone here. We've made our code more readable, easier to follow and we've removed a whole bunch of repeated operations. This is also beginning to look like Rust and less like PHP ;-)
Option is similar and supports the same operations. If you want to process the content of either Option or Result and branch accordingly, and you're using unwrap, there are so many pitfalls you'll inevitably fall into when you forget you unwrapped something.
There are genuine cases where your program should barf out. For those, consider expect(&str) as opposed to unwrap()
In many, many cases you can avoid unwrap and others by more elegant means. However, I think there are situations where it is the correct solution to unwrap.
For example, many methods in Iterator return an Option. Let us assume that you have a nonempty slice (known to be nonempty by invariants) and you want to obtain the maximum, you could do the following:
assert!(!slice.empty()); // known to be nonempty by invariants
do_stuff_with_maximum(slice.iter().max().unwrap());
There are probably several opinions regarding this, but I would argue that using unwrap in the above scenario is perfectly fine - in the presence of the preceeding assert!.
My guideline is: If the parameters I am dealing with are all coming from my own code, not interfacing with 3rd party code, possibly assert!ing invariants, I am fine with unwrap. As soon as I am the slightest bit unsure, I resort to if, match, map and others.
Note that there is also expect which is basically an "unwrap with a comment printed in the error case". However, I have found this to be not-really-ergonomic. Moreover, I found the backtraces a bit hard to read if unwrap fails. Thus, I currently use a macro verify! whose sole argument is an Option or Result and that checks that the value is unwrapable. It is implemented like this:
pub trait TVerifiableByVerifyMacro {
fn is_verify_true(&self) -> bool;
}
impl<T> TVerifiableByVerifyMacro for Option<T> {
fn is_verify_true(&self) -> bool {
self.is_some()
}
}
impl<TOk, TErr> TVerifiableByVerifyMacro for Result<TOk, TErr> {
fn is_verify_true(&self) -> bool {
self.is_ok()
}
}
macro_rules! verify {($e: expr) => {{
let e = $e;
assert!(e.is_verify_true(), "verify!({}): {:?}", stringify!($e), e)
e
}}}
Using this macro, the aforementioned example could be written as:
assert!(!slice.empty()); // known to be nonempty by invariants
do_stuff_with_maximum(verify!(slice.iter().max()).unwrap());
If I can't unwrap the value, I get an error message mentioning slice.iter().max(), so that I can search my codebase quickly for the place where the error occurs. (Which is - in my experience - faster than looking through the backtrace for the origin of the error.)

Should I pass a mutable reference or transfer ownership of a variable in the context of FFI?

I have a program that utilizes a Windows API via a C FFI (via winapi-rs). One of the functions expects a pointer to a pointer to a string as an output parameter. The function will store its result into this string. I'm using a variable of type WideCString for this string.
Can I "just" pass in a mutable ref to a ref to a string into this function (inside an unsafe block) or should I rather use a functionality like .into_raw() and .from_raw() that also moves the ownership of the variable to the C function?
Both versions compile and work but I'm wondering whether I'm buying any disadvantages with the direct way.
Here are the relevant lines from my code utilizing .into_raw and .from_raw.
let mut widestr: WideCString = WideCString::from_str("test").unwrap(); //this is the string where the result should be stored
let mut security_descriptor_ptr: winnt::LPWSTR = widestr.into_raw();
let rtrn3 = unsafe {
advapi32::ConvertSecurityDescriptorToStringSecurityDescriptorW(sd_buffer.as_mut_ptr() as *mut std::os::raw::c_void,
1,
winnt::DACL_SECURITY_INFORMATION,
&mut security_descriptor_ptr,
ptr::null_mut())
};
if rtrn3 == 0 {
match IOError::last_os_error().raw_os_error() {
Some(1008) => println!("Need to fix this errror in get_acl_of_file."), // Do nothing. No idea, why this error occurs
Some(e) => panic!("Unknown OS error in get_acl_of_file {}", e),
None => panic!("That should not happen in get_acl_of_file!"),
}
}
let mut rtr: WideCString = unsafe{WideCString::from_raw(security_descriptor_ptr)};
The description of this parameter in MSDN says:
A pointer to a variable that receives a pointer to a null-terminated security descriptor string. For a description of the string format, see Security Descriptor String Format. To free the returned buffer, call the LocalFree function.
I am expecting the function to change the value of the variable. Doesn't that - per definition - mean that I'm moving ownership?
I am expecting the function to change the value of the variable. Doesn't that - per definition - mean that I'm moving ownership?
No. One key way to think about ownership is: who is responsible for destroying the value when you are done with it.
Competent C APIs (and Microsoft generally falls into this category) document expected ownership rules, although sometimes the words are oblique or assume some level of outside knowledge. This particular function says:
To free the returned buffer, call the LocalFree function.
That means that the ConvertSecurityDescriptorToStringSecurityDescriptorW is going to perform some kind of allocation and return that to the user. Checking out the function declaration, you can also see that they document that parameter as being an "out" parameter:
_Out_ LPTSTR *StringSecurityDescriptor,
Why is it done this way? Because the caller doesn't know how much memory to allocate to store the string 1!
Normally, you'd pass a reference to uninitialized memory into the function which must then initialize it for you.
This compiles, but you didn't provide enough context to actually call it, so who knows if it works:
extern crate advapi32;
extern crate winapi;
extern crate widestring;
use std::{mem, ptr, io};
use winapi::{winnt, PSECURITY_DESCRIPTOR};
use widestring::WideCString;
fn foo(sd_buffer: PSECURITY_DESCRIPTOR) -> WideCString {
let mut security_descriptor = unsafe { mem::uninitialized() };
let retval = unsafe {
advapi32::ConvertSecurityDescriptorToStringSecurityDescriptorW(
sd_buffer,
1,
winnt::DACL_SECURITY_INFORMATION,
&mut security_descriptor,
ptr::null_mut()
)
};
if retval == 0 {
match io::Error::last_os_error().raw_os_error() {
Some(1008) => println!("Need to fix this errror in get_acl_of_file."), // Do nothing. No idea, why this error occurs
Some(e) => panic!("Unknown OS error in get_acl_of_file {}", e),
None => panic!("That should not happen in get_acl_of_file!"),
}
}
unsafe { WideCString::from_raw(security_descriptor) }
}
fn main() {
let x = foo(ptr::null_mut());
println!("{:?}", x);
}
[dependencies]
winapi = { git = "https://github.com/nils-tekampe/winapi-rs/", rev = "1bb62e2c22d0f5833cfa9eec1db2c9cfc2a4a303" }
advapi32-sys = { git = "https://github.com/nils-tekampe/winapi-rs/", rev = "1bb62e2c22d0f5833cfa9eec1db2c9cfc2a4a303" }
widestring = "*"
Answering your questions directly:
Can I "just" pass in a mutable ref to a ref to a string into this function (inside an unsafe block) or should I rather use a functionality like .into_raw() and .from_raw() that also moves the ownership of the variable to the C function?
Neither. The function doesn't expect you to pass it a pointer to a string, it wants you to pass a pointer to a place where it can put a string.
I also just realized after your explanation that (as far as I understood it) in my example, the widestr variable never gets overwritten by the C function. It overwrites the reference to it but not the data itself.
It's very likely that the memory allocated by WideCString::from_str("test") is completely leaked, as nothing has a reference to that pointer after the function call.
Is this a general rule that a C (WinAPI) function will always allocate the buffer by itself (if not following the two step approach where it first returns the size)?
I don't believe there are any general rules between C APIs or even inside of a C API. Especially at a company as big as Microsoft with so much API surface. You need to read the documentation for each method. This is part of the constant drag that can make writing C feel like a slog.
it somehow feels odd for me to hand over uninitialized memory to such a function.
Yep, because there's not really a guarantee that the function initializes it. In fact, it would be wasteful to initialize it in case of failure, so it probably doesn't. It's another thing that Rust seems to have nicer solutions for.
Note that you shouldn't do function calls (e.g. println!) before calling things like last_os_error; those function calls might change the value of the last error!
1 Other Windows APIs actually require a multistep process - you call the function with NULL, it returns the number of bytes you need to allocate, then you call it again

Resources