Rust executing methods concurrently - rust

I'm trying to learn Rust and am having trouble working with streams of futures. I have the following code:
// Stocks: Vec<Stock> || Stock is my struct that implements method get_stock_depth
let futures = stocks.iter();
let futures = futures.map(|x| x.get_stock_depth());
let stream = stream::iter(futures);
let stream = stream.buffer_unordered(10);
let result = stream.collect().await;
The stocks vector contains over 800 objects, so I'd like to limit the number of concurrent executions. When I run this code, I get the following error:
type inside async block must be known in this context
cannot infer type for type parameter C declared on the associated function collect
Am I missing something?

This almost certainly has nothing to do with async or futures. It is just the usual requirement that collect be told which type to produce. collect() can build many different collection types and doesn't know which one you want. You probably want a Vec:
let result: Vec<_> = stream.collect().await;
You don't typically need to tell collect what to fill the Vec with (it can usually figure that out), so you can use _, but you do need to tell it what collection type you want.
You might also write this as:
let result = stream.collect::<Vec<_>>().await;
Or if this is the last line of a function that returns result, you can use type inference on the return type by dropping the assignment and the semicolon:
stream.collect().await
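The same requirement shows up with ordinary iterators, so it is easy to try without an async runtime. Here is a minimal sketch using only the standard library. The function names and sample data are made up for illustration and are not taken from the question:

```rust
use std::collections::HashSet;

// The target collection is chosen with a type annotation on the binding.
fn collect_with_annotation(words: &[&str]) -> Vec<usize> {
    let lengths: Vec<_> = words.iter().map(|w| w.len()).collect();
    lengths
}

// The same iterator collected into a different collection via turbofish.
fn collect_with_turbofish(words: &[&str]) -> HashSet<usize> {
    words.iter().map(|w| w.len()).collect::<HashSet<_>>()
}

// As a tail expression, the function's return type tells collect what to build.
fn collect_from_return_type(words: &[&str]) -> Vec<usize> {
    words.iter().map(|w| w.len()).collect()
}
```

Given the input ["a", "bb", "cc"], the two Vec versions keep all three lengths in order, while the HashSet version keeps only the distinct values 1 and 2. Both are valid results of the same iterator, which is why collect cannot pick a target type on its own.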

Related

Use of moved value

I have a simple function in Rust that iterates through numbers and adds them to a vector if they fulfill a condition. This condition is a function that uses a previously defined variable, prime_factors.
The is_multiperfect function only needs to look things up in the prime_factors variable.
fn get_all_mpn_below(integer: usize) -> Vec<usize> {
    let prime_factors = get_prime_factors_below(integer);
    let mut mpn = vec![1];
    for n in (2..integer).step_by(2) {
        if is_multiperfect(n, prime_factors) {
            mpn.push(n);
        }
    }
    return mpn;
}
However, this yields the following error:
use of moved value: `prime_factors`
let prime_factors = get_prime_factors_below(integer);
------------- move occurs because `prime_factors` has type `HashMap<usize, Vec<usize>>`, which does not implement the `Copy` trait
if is_multiperfect(n, prime_factors) {
^^^^^^^^^^^^^ value moved here, in previous iteration of loop
I've looked up the error and found it was about ownership, however I fail to understand how ownership applies here.
How can I fix this error?
"...as I don't declare another variable."
Why would you think that's relevant?
Moving is simply Rust's default behaviour when transferring values (assigning them, passing them to functions, or returning them from functions). This applies to every type that is not Copy.
"How can I fix this error?"
Hard to say since the problem is is_multiperfect and you don't provide that code, so the reader, not being psychic, has no way to know what is_multiperfect wants out of prime_factors.
Possible solutions are:
clone() the map. This creates a complete copy that the callee can use however it wants, while the original stays available. It gives the callee complete freedom but is expensive for the caller, especially when it happens on every loop iteration.
pass the map as an &mut (unique / mutable reference) if the callee needs to update it.
pass the map as an & (shared reference) if the callee just needs to look things up in the map.
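Since the question says is_multiperfect only looks things up, the shared-reference option is probably the best fit. The sketch below shows what that could look like. The bodies of get_prime_factors_below and is_multiperfect are hypothetical stand-ins, because the original implementations were not posted. The relevant parts are the &HashMap parameter and the &prime_factors argument at the call site:

```rust
use std::collections::HashMap;

// Hypothetical stand-in: maps every n in 2..limit to its distinct prime factors.
fn get_prime_factors_below(limit: usize) -> HashMap<usize, Vec<usize>> {
    let mut factors = HashMap::new();
    for n in 2..limit {
        let mut rest = n;
        let mut primes = Vec::new();
        let mut p = 2;
        while p * p <= rest {
            if rest % p == 0 {
                primes.push(p);
                while rest % p == 0 {
                    rest /= p;
                }
            }
            p += 1;
        }
        if rest > 1 {
            primes.push(rest);
        }
        factors.insert(n, primes);
    }
    factors
}

// Hypothetical stand-in: borrows the map instead of taking ownership of it.
// A number is treated as multiperfect when its divisor sum is a multiple of it.
fn is_multiperfect(n: usize, prime_factors: &HashMap<usize, Vec<usize>>) -> bool {
    let primes = match prime_factors.get(&n) {
        Some(primes) => primes,
        None => return false,
    };
    let mut sigma = 1;
    let mut rest = n;
    for &p in primes {
        // Accumulate 1 + p + p^2 + ... for the full power of p dividing n.
        let (mut term, mut power) = (1, 1);
        while rest % p == 0 {
            rest /= p;
            power *= p;
            term += power;
        }
        sigma *= term;
    }
    sigma % n == 0
}

fn get_all_mpn_below(integer: usize) -> Vec<usize> {
    let prime_factors = get_prime_factors_below(integer);
    let mut mpn = vec![1];
    for n in (2..integer).step_by(2) {
        // Passing a reference leaves prime_factors usable on the next iteration.
        if is_multiperfect(n, &prime_factors) {
            mpn.push(n);
        }
    }
    mpn
}
```

Because only a reference is handed over, nothing is moved and nothing is copied, so the loop can call is_multiperfect as often as it likes.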

Writing expression in polars-lazy in rust

I need to write my own expression in polars_lazy. Based on my understanding of the source code, I need to write a function that returns Expr::Function. The problem is that constructing an object of this type requires an object of type FunctionOptions. The caveat is that this struct is public but its members are pub(crate), so one cannot construct such an object outside of the crate.
Are there ways around this?
I don't think you're meant to directly construct Exprs. Instead, you can use functions like polars_lazy::dsl::col() and polars_lazy::dsl::lit() to create expressions, then use methods on Expr to build up the expression. Several of those methods, such as map() and apply(), will give you an Expr::Function.
Personally I think the Rust API for polars is not well documented enough to really use yet. Although the other answer and comments mention apply and map, they don't mention how or the trade-offs. I hope this answer prompts others to correct me with the "right" way to do things.
So first, here's how to use apply on a lazy dataframe. Lazy dataframes don't take apply directly as a method the way eager ones do; instead, apply is called on a column expression inside with_column. This version replaces the column in place:
// not sure how you'd find this type easily from apply documentation
let o = GetOutput::from_type(DataType::UInt32);
// this mutates two in place
let lf = lf.with_column(col("two").apply(str_to_len, o));
And here's how to use it while not mutating the source column and adding a new output column instead:
let o = GetOutput::from_type(DataType::UInt32);
// this adds new column len, two is unchanged
let lf = lf.with_column(col("two").alias("len").apply(str_to_len, o));
With the str_to_len looking like:
fn str_to_len(str_val: Series) -> Result<Series> {
    let x = str_val
        .utf8()
        .unwrap()
        .into_iter()
        // your actual custom function would be in this map
        .map(|opt_name: Option<&str>| opt_name.map(|name: &str| name.len() as u32))
        .collect::<UInt32Chunked>();
    Ok(x.into_series())
}
Note that it takes Series rather than &Series and wraps in Result.
With a regular (non-lazy) dataframe, apply still mutates but doesn't require with_column:
df.apply("two", str_to_len).expect("applied");
Whereas eager/non-lazy's with_column doesn't require apply:
// the fn we use to make the column names it too
df.with_column(str_to_len(df.column("two").expect("has two"))).expect("with_column");
And str_to_len has slightly different signature:
fn str_to_len(str_val: &Series) -> Series {
    let mut x = str_val
        .utf8()
        .unwrap()
        .into_iter()
        .map(|opt_name: Option<&str>| opt_name.map(|name: &str| name.len() as u32))
        .collect::<UInt32Chunked>();
    // NB. this is naming the chunked array, before we even get to a series
    x.rename("len");
    x.into_series()
}
I know there are reasons to have lazy and eager operate differently, but I wish the Rust documentation made this easier to figure out.

Getting query string from Window object in WebAssembly in Rust

Context: I am learning Rust & WebAssembly and as a practice exercise I have a project that paints stuff in HTML Canvas from Rust code. I want to get the query string from the web request and from there the code can decide which drawing function to call.
I wrote this function to just return the query string with the leading ? removed:
fn decode_request(window: web_sys::Window) -> std::string::String {
    let document = window.document().expect("no global window exist");
    let location = document.location().expect("no location exists");
    let raw_search = location.search().expect("no search exists");
    let search_str = raw_search.trim_start_matches("?");
    format!("{}", search_str)
}
It does work, but it seems amazingly verbose given how much simpler it would be in some of the other languages I have used.
Is there an easier way to do this? Or is the verbosity just the price you pay for safety in Rust and I should just get used to it?
Edit per answer from @IInspectable:
I tried the chaining approach and got this error:
temporary value dropped while borrowed
creates a temporary which is freed while still in use
note: consider using a `let` binding to create a longer lived value rustc(E0716)
It would be nice to understand that better; I am still getting the niceties of ownership through my head. The function is now:
fn decode_request(window: Window) -> std::string::String {
    let location = window.location();
    let search_str = location.search().expect("no search exists");
    let search_str = search_str.trim_start_matches('?');
    search_str.to_owned()
}
which is certainly an improvement.
This question is really about API design rather than its effects on the implementation. The implementation turned out to be fairly verbose mostly due to the contract chosen: Either produce a value, or die. There's nothing inherently wrong with this contract. A client calling into this function will never observe invalid data, so this is perfectly safe.
This may not be the best option for library code, though. Library code usually lacks context, and cannot make a good call on whether any given error condition is fatal or not. That's a question client code is in a far better position to answer.
Before moving on to explore alternatives, let's rewrite the original code in a more compact fashion, by chaining the calls together, without explicitly assigning each result to a variable:
fn decode_request(window: web_sys::Window) -> std::string::String {
    window
        .location()
        .search()
        .expect("no search exists")
        .trim_start_matches('?')
        .to_owned()
}
I'm not familiar with the web_sys crate, so there is a bit of guesswork involved. Namely, I'm assuming that window.location() returns the same value as the document()'s location(). Apart from chaining calls, the code presented makes two more changes:
trim_start_matches() is passed a character literal in place of a string literal. This produces optimal code without relying on the compiler's optimizer to figure out that a string of length 1 is really a search for a single character.
The return value is constructed by calling to_owned(). The format! macro adds overhead, and eventually calls to_string(). While that would exhibit the same behavior in this case, using the semantically more accurate to_owned() function helps you catch errors at compile time (e.g. if you accidentally returned 42.to_string()).
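Both changes can be tried without a browser by pulling the trimming step out into a function that works on a plain &str. The sketch below does that. The helper names are hypothetical and not part of web_sys. One detail worth knowing is that trim_start_matches removes every leading match, not just the first, so a strip_prefix-based variant is included for comparison:

```rust
// Hypothetical helper: the trimming step of decode_request, taking a plain
// &str so it can be exercised without web_sys or a browser.
fn strip_query_prefix(raw_search: &str) -> String {
    // A char pattern; this removes every leading '?', not just the first one.
    raw_search.trim_start_matches('?').to_owned()
}

// Variant that removes at most one leading '?', leaving the input unchanged
// when it does not start with one.
fn strip_one_question_mark(raw_search: &str) -> String {
    raw_search
        .strip_prefix('?')
        .unwrap_or(raw_search)
        .to_owned()
}
```

For a typical search string such as "?shape=circle" both helpers return "shape=circle". They only differ on inputs with several leading question marks.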
Alternatives
A more natural way to implement this function is to have it return either a value representing the query string, or no value at all. Rust provides the Option type to conveniently model this:
fn decode_request(window: web_sys::Window) -> Option<String> {
    match window.location().search() {
        Ok(s) => Some(s.trim_start_matches('?').to_owned()),
        _ => None,
    }
}
This allows a client of the function to make decisions, depending on whether the function returns Some(s) or None. This maps all error conditions into a None value.
If it is desirable to convey the reason for failure back to the caller, the decode_request function can choose to return a Result value instead, e.g. Result<String, wasm_bindgen::JsValue>. In doing so, an implementation can take advantage of the ? operator, to propagate errors to the caller in a compact way:
fn decode_request(window: web_sys::Window) -> Result<String, wasm_bindgen::JsValue> {
    Ok(window
        .location()
        .search()?
        .trim_start_matches('?')
        .to_owned())
}
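On the "temporary value dropped while borrowed" error from the question's edit: trim_start_matches returns a &str that borrows from the String it is called on. If that String is never bound to a variable, it is a temporary that is dropped at the end of the statement, so a &str kept in a let binding would outlive the data it points to. Here is a sketch of the situation, using a hypothetical raw_search() function in place of the web_sys calls:

```rust
// Hypothetical stand-in for location.search().expect(...): returns an owned String.
fn raw_search() -> String {
    String::from("?shape=circle")
}

fn query_via_binding() -> String {
    // Using `trimmed` after a line like the following is rejected with E0716,
    // because the String is dropped at the end of that statement while
    // `trimmed` still borrows from it:
    // let trimmed = raw_search().trim_start_matches('?');

    // Binding the String first keeps it alive for as long as the borrow.
    let search = raw_search();
    let trimmed = search.trim_start_matches('?');
    trimmed.to_owned()
}

fn query_in_one_expression() -> String {
    // Also fine: to_owned() copies the data before the temporary is dropped
    // at the end of the full expression.
    raw_search().trim_start_matches('?').to_owned()
}
```

This is why the fully chained version compiles when it ends in to_owned(), while a chain that stops at trim_start_matches and stores the result does not.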

Why does `thread::JoinHandle<T>` have a type parameter?

In Rust, the thread::JoinHandle<T> type included with the standard library has the type parameter T. However, it doesn't seem that T is actually set or used for anything.
Indeed, Rust's own documentation mostly just uses thread::JoinHandle<_> whenever it needs to assign a JoinHandle<T> to something. What does this T actually do?
It's the type of the value returned from the thread's closure. The compiler can usually infer it, so you rarely need to write it out explicitly. There are a few examples in the documentation for join, which returns a thread::Result<T>.
The following example is from the documentation:
spawn returns a JoinHandle, which when joined returns the Result.
let computation = thread::spawn(|| {
    // Some expensive computation.
    42
});
let result = computation.join().unwrap();
println!("{}", result);
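To make T visible, the handle type can be written out explicitly. In this small sketch (the function names are made up for illustration), T is simply whatever the closure passed to spawn returns:

```rust
use std::thread::{self, JoinHandle};

// The closure returns a u64, so the handle is a JoinHandle<u64>.
fn spawn_sum(values: Vec<u64>) -> JoinHandle<u64> {
    thread::spawn(move || values.iter().sum::<u64>())
}

// A closure returning a String gives a JoinHandle<String>.
fn spawn_greeting(name: String) -> JoinHandle<String> {
    thread::spawn(move || format!("hello, {}", name))
}

// join() hands back the closure's value, wrapped in a Result that is Err
// if the thread panicked.
fn sum_in_thread(values: Vec<u64>) -> u64 {
    spawn_sum(values).join().expect("thread panicked")
}
```

Spelling the type out is mostly useful in function signatures and struct fields, where inference is not available. For local bindings, JoinHandle<_> or no annotation at all is usually enough.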

How to use .collect() on each iterator returned by .unzip()?

I have the following code, in which fac returns (MyType, OtherType):
let l = (-1..13).map(|x| {
    fac(x).0
}).collect::<Vec<MyType>>();
It works, but I'm throwing away the OtherType values. So I decided to use .unzip, like this:
let (v, r) = (-1..13).map(|x| {
    fac(x)
}).unzip();
let l = v.collect::<Vec<MyType>>();
let q = r.collect::<Vec<OtherType>>();
But type inference fails with:
error: the type of this value must be known in this context
let l = v.collect::<Vec<Literal>>();
^~~~~~~~~~~~~~~~~~~~~~~~~~~
let q = r.collect::<Vec<OtherType>>();
^~~~~~~~~~~~~~~~~~~~~~~~~~~
The thing is: I don't know or care what is the concrete type of the iterators (and I would suppose the compiler could infer them, as shown in the first snippet). How to satisfy the compiler in this case?
Also, I would prefer to restructure the code - I don't like to separately call .collect() on both v and r. Ideally I would continue the method chain after .unzip(), returning two Vecs in that expression.
.unzip() doesn't return iterators — it acts like two parallel collect! You can in fact collect the two pieces to different kinds of collections, but let's use vectors for both in this example:
// Give a type hint to determine the collection type
let (v, r): (Vec<MyType>, Vec<OtherType>) = (-1..13).map(|x| {
    fac(x)
}).unzip();
It is done this way to be as simple and transparent as possible. Returning two iterators instead would need them to share a common state, a complexity that Rust's iterator library prefers to avoid.
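As a runnable sketch of the same idea, the version below uses a hypothetical fac that returns (i64, String), since MyType and OtherType were not shown in the question. The second function illustrates the point that the two halves may be collected into different kinds of collections:

```rust
use std::collections::HashSet;

// Hypothetical stand-in for fac: returns a pair of two different types.
fn fac(x: i32) -> (i64, String) {
    (i64::from(x) * 2, format!("n{}", x))
}

// Both halves collected into vectors; the return type is the type hint.
fn unzip_into_vecs() -> (Vec<i64>, Vec<String>) {
    (-1..3).map(fac).unzip()
}

// Each half may use a different collection type.
fn unzip_into_vec_and_set() -> (Vec<i64>, HashSet<String>) {
    (-1..3).map(fac).unzip()
}
```

In both functions the method chain ends at unzip(), and no separate collect() calls are needed afterwards.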
