What is the idiomatic Rust equivalent of R's mapply()? mapply() takes a function and several iterables, and calls the function with the first elements of each iterable as arguments, then with the second elements, and so on.
I am currently using the future_mapply() function in R from the future library to do it in parallel as well, but am finding it to be too slow.
Any help is appreciated.
There is no direct equivalent, as Rust doesn't deal with variadic functions or abstract over tuples of different lengths (better to ignore HLists here). If your number of iterators is fixed, you can use it1.zip(it2).zip(it3).map(|((e1, e2), e3)| f(e1, e2, e3)) or itertools::izip!.
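For the fixed-arity case, a minimal sketch using only the standard library might look like the following. The function f and the input vectors are made up for illustration; the point is only how the nested tuple produced by chained zip calls gets destructured.

```rust
// Hypothetical three-argument function standing in for the one passed to mapply().
fn f(a: i32, b: f64, c: &str) -> String {
    format!("{}-{}-{}", a, b, c)
}

// Zip three iterators with different item types and call `f` on each triple.
fn mapply3() -> Vec<String> {
    let xs = vec![1, 2, 3];
    let ys = vec![0.5, 1.5, 2.5];
    let zs = vec!["a", "b", "c"];

    xs.into_iter()
        .zip(ys)
        .zip(zs)
        // Each item is a nested pair ((x, y), z), so destructure it in the closure.
        .map(|((x, y), z)| f(x, y, z))
        .collect()
}

fn main() {
    for s in mapply3() {
        println!("{}", s);
    }
}
```

Note that zip stops as soon as the shortest input is exhausted, so inputs of unequal length are silently truncated rather than recycled as in R.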
If all your iterators have the same type (i.e. can be put into a Vec) and the function to be applied is fine with receiving the elements as a Vec, you could do something like
std::iter::from_fn(move || {
    iter_vec // the vector with your input iterators
        .iter_mut()
        .map(Iterator::next)
        .collect::<Option<Vec<_>>>()
}).map(f)
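To make the snippet above concrete, here is a self-contained sketch. The names mapply_vec and f, and the choice of Vec<i32> inputs with a summing function, are assumptions made purely for illustration.

```rust
use std::iter;

// Hypothetical function that receives one "row" of arguments as a Vec.
fn f(args: Vec<i32>) -> i32 {
    args.iter().sum()
}

fn mapply_vec(inputs: Vec<Vec<i32>>) -> Vec<i32> {
    // One iterator per input; they must all share a single concrete type.
    let mut iter_vec: Vec<_> = inputs.into_iter().map(|v| v.into_iter()).collect();

    iter::from_fn(move || {
        iter_vec
            .iter_mut()
            .map(Iterator::next)
            // Collecting into Option<Vec<_>> gives None as soon as any input
            // iterator is exhausted, which ends the outer iterator.
            .collect::<Option<Vec<_>>>()
    })
    .map(f)
    .collect()
}

fn main() {
    let out = mapply_vec(vec![vec![1, 2, 3], vec![10, 20, 30], vec![100, 200]]);
    println!("{:?}", out);
}
```

One caveat with this sketch: if inputs is empty, every step collects to Some(vec![]) and the iterator never ends, so a real implementation would want to guard against that case.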
I think you'll have to describe your problem a bit more for your question to be properly answered.
Related
I have been looking for a way to parallelize map in Rust, and most answers point to the rayon crate, so I wonder: does std::iter::Map iterate sequentially by default?
I wonder: does std::iter::Map iterate sequentially by default?
It does.
Rust iterators are lazy, meaning that nothing is computed unless asked explicitly. And they are computed one by one until the iterator is exhausted.
Map, from the documentation:
An iterator that maps the values of iter with f
Map is an iterator adaptor: it applies a transformation function to each item of the underlying iterator, one by one, as each item is requested through the next method.
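A small sketch can make the laziness and the one-by-one evaluation visible. The call counter and the function name lazy_map_demo are only there for illustration; they count how many times the closure has actually run at each point.

```rust
use std::cell::Cell;

// Returns the number of closure calls observed before any item is requested,
// after one call to next(), and after the iterator is fully consumed.
fn lazy_map_demo() -> (usize, usize, usize) {
    let calls = Cell::new(0usize);
    let data = [1, 2, 3];

    // Building the adaptor does not run the closure.
    let mut doubled = data.iter().map(|x| {
        calls.set(calls.get() + 1);
        x * 2
    });
    let before = calls.get();

    // Each call to next() runs the closure for exactly one item.
    let first = doubled.next();
    let after_one = calls.get();
    assert_eq!(first, Some(2));

    // Consuming the rest runs the closure for the remaining items, in order.
    let rest: Vec<i32> = doubled.collect();
    let after_all = calls.get();
    assert_eq!(rest, vec![4, 6]);

    (before, after_one, after_all)
}

fn main() {
    println!("{:?}", lazy_map_demo());
}
```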
I have a nested Vec<Vec<f64>> in Rust, and I want to multiply each f64 in place by a value DT. I am currently doing:
dcm_dot.iter_mut().map(|a| a.iter_mut().map(|b| * b * DT));
This works, however, I am getting a lazy iterator warning, that the .map()s must be consumed. Is there a more idiomatically correct way to do this?
Iterators in Rust are lazy so unless you use the result of the .map(),
the closure inside will not even be executed.
In order to ensure that your code actually changes the Vec, you should use .for_each() instead.
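A minimal sketch of what that could look like follows. The value of DT and the function names are assumptions, since the question does not show them. Note that the closure also has to assign through the reference (*b *= DT); computing *b * DT alone would discard the product even once the iterator is consumed.

```rust
// Assumed value; the question does not show how DT is defined.
const DT: f64 = 0.5;

// Multiply every element in place using nested for_each calls.
fn scale_in_place(dcm_dot: &mut [Vec<f64>]) {
    dcm_dot
        .iter_mut()
        .for_each(|row| row.iter_mut().for_each(|b| *b *= DT));
}

// Equivalent plain loop: flatten() walks the inner vectors one after another.
fn scale_in_place_loop(dcm_dot: &mut [Vec<f64>]) {
    for b in dcm_dot.iter_mut().flatten() {
        *b *= DT;
    }
}

fn main() {
    let mut m = vec![vec![1.0, 2.0], vec![4.0]];
    scale_in_place(&mut m);
    println!("{:?}", m);
    scale_in_place_loop(&mut m);
    println!("{:?}", m);
}
```

Which of the two reads better is mostly a matter of taste; the loop form is often preferred when the body only has side effects.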
Is there any technical reason Rust is designed to use dot notation for tuples instead of using index notation (t[2])?
let t = (20u32, true, 'b');
t.2; // -> 'b'
Dot notation seems natural in accessing struct's and object's properties. I couldn't find a resource or explanation online.
I had no part in the design decisions, but here's my perspective:
Tuples contain mixed types. That is, the property type_of(t[i]) == type_of(t[j]) cannot be guaranteed.
However, conventional indexing works on the premise that the i in t[i] need not be a compile-time constant, which in turn means that the type of t[i] must be the same for every possible i. This holds for every other Rust collection that implements indexing. Specifically, Rust types are made indexable by implementing the Index trait, defined as follows:
pub trait Index<Idx: ?Sized> {
    type Output: ?Sized;
    fn index(&self, index: Idx) -> &Self::Output;
}
So if you wanted a tuple to implement indexing, what type should Self::Output be? The only way to pull this off would be to make Self::Output an enum, which means that element accesses would have to be wrapped around a useless match t[i] clause (or something similar) on the programmer's side, and you'll be catching type errors at runtime instead of compile-time.
Furthermore, you now have to implement bounds-checking, which is again a runtime error, unless you're clever in your tuple implementation.
You could bypass these issues by requiring that the index be a compile-time constant, but at that point tuple item accesses are pretending to behave like a normal index operation while actually behaving inconsistently with respect to all other Rust containers, and there's nothing good about that.
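To illustrate the contrast, here is a rough sketch. Triple is a hypothetical homogeneous container invented for this example: it can implement Index because one Output type covers every index, whereas the tuple's fields each have their own type and are selected by a literal position that the compiler checks.

```rust
use std::ops::Index;

// Hypothetical homogeneous container: every index yields an f64,
// so a single associated Output type is enough.
struct Triple([f64; 3]);

impl Index<usize> for Triple {
    type Output = f64;

    fn index(&self, i: usize) -> &f64 {
        // The index is a runtime value, so the bounds check happens at runtime.
        &self.0[i]
    }
}

fn index_vs_field() -> (f64, char) {
    let triple = Triple([1.0, 2.0, 3.0]);
    let i = 2; // could just as well be computed at runtime
    let t = (20u32, true, 'b');

    // t.0 is u32, t.1 is bool, t.2 is char: the position must be a literal,
    // and an out-of-range position such as t.3 is a compile error.
    (triple[i], t.2)
}

fn main() {
    println!("{:?}", index_vs_field());
}
```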
This decision was made in RFC 184. The Motivation section has details:
Right now accessing fields of tuples and tuple structs is incredibly painful—one must rely on pattern-matching alone to extract values. This became such a problem that twelve traits were created in the standard library (core::tuple::Tuple*) to make tuple value accesses easier, adding .valN(), .refN(), and .mutN() methods to help this. But this is not a very nice solution—it requires the traits to be implemented in the standard library, not the language, and for those traits to be imported on use. On the whole this is not a problem, because most of the time std::prelude::* is imported, but this is still a hack which is not a real solution to the problem at hand. It also only supports tuples of length up to twelve, which is normally not a problem but emphasises how bad the current situation is.
The discussion in the associated pull request is also useful.
The reason for using t.2 syntax instead of t[2] is best explained in this comment:
Indexing syntax everywhere else has a consistent type, but a tuple is heterogenous so a[0] and a[1] would have different types.
I want to provide an answer based on my experience using a functional language (OCaml) in the time since I posted this question.
Apart from @rom1v's reference, indexing syntax like a[0] is used everywhere else for some kind of sequence structure, which tuples are not. In OCaml, for instance, a tuple (1, "one") is said to have type int * string, which corresponds to the Cartesian product in mathematics (i.e., the plane is R^2 = R * R). Plus, accessing a tuple by its nth index is considered unidiomatic.
Due to its polymorphic nature, a tuple can almost be thought of as a record / object, which often prefers dot notation like a.fieldName as the convention for accessing its fields (except in a language like JavaScript, which treats objects like dictionaries and allows string literal access like a["fieldname"]). The only language I'm aware of that uses indexing syntax to access a field is Lua.
Personally, I think syntax like a.(0) tends to look better than a.0, but this may be intentionally (or not) awkward, considering that in most functional languages it is preferable to pattern-match a tuple instead of accessing it by index. Since Rust is also imperative, syntax like a.10 can be a good reminder to pattern-match or "go use a struct" already.
Learning Rust (yay!) and I'm trying to understand the intended idiomatic programming required for certain iterator patterns, while scoring top performance. Note: not Rust's Iterator trait, just a method I've written accepting a closure and applying it to some data I'm pulling off of disk / out of memory.
I was delighted to see that Rust (+LLVM?) took an iterator I had written for sparse matrix entries, and a closure for doing sparse matrix vector multiplication, written as
iterator.map_edges({ |x, y| dst[y] += src[x] });
and inlined the closure's body in the generated code. It went quite fast. :D
If I create two of these iterators, or use the first a second time (not a correctness issue) each instance slows down quite a lot (about 2x in this case), presumably because the optimizer no longer chooses to do specialization because of the multiple call sites, and you end up doing a function call for each element.
I'm trying to understand if there are idiomatic patterns that keep the pleasant experience above (I like it, at least) without sacrificing the performance. My options seem to be (none satisfying this constraint):
1. Accept dodgy performance (2x slower is not fatal, but no prizes either).
2. Ask the user to supply a batch-oriented closure, so acting on an iterator over a small batch of data. This exposes a bit much of the internals of the iterator (the data are compressed nicely, and the user needs to know how to unwrap them, or the iterator needs to stage an unwrapped batch in memory).
3. Make map_edges generic in a type implementing a hypothetical EdgeMapClosure trait, and ask the user to implement such a type for each closure they want to inline. Not tested, but I would guess this exposes distinct methods to LLVM, each of which gets nicely inlined. Downside is that the user has to write their own closure (packing relevant state up, etc).
4. Horrible hacks, like making distinct methods map_edges0, map_edges1, ... . Or adding a generic parameter the programmer can use to make the methods distinct, but which is otherwise ignored.
Non-solutions include "just use for pair in iterator.iter() { /* */ }"; this is prep work for a data/task-parallel platform, and I would like to be able to capture/move these closures to worker threads rather than capturing the main thread's execution. Maybe the pattern I should be using is to write the above, put it in a lambda/closure, and ship it around instead?
In a perfect world, it would be great to have a pattern which causes each occurrence of map_edges in the source file to result in different specialized methods in the binary, without forcing the entire project to be optimized at some scary level. I'm coming out of an unpleasant relationship with managed languages and JITs where generics would be the only way (I know of) to get this to happen, but Rust and LLVM seem magical enough that I thought there might be a good way. How do Rust's iterators handle this to inline their closure bodies? Or don't they (they should!)?
It seems that the problem is resolved by Rust's new approach to closures outlined at
http://smallcultfollowing.com/babysteps/blog/2014/11/26/purging-proc/
In short, Option 3 above (make functions generic with respect to a new closure type) is now transparently implemented when you make an implementation generic using the new closure traits. Rust produces the type behind the scenes for you.
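As a rough sketch of what that looks like with today's closure traits: the Edges struct and its pairs field below are hypothetical stand-ins for the sparse-matrix container from the question, and spmv is an invented wrapper. Because map_edges is generic over the closure type F, and every closure has its own anonymous type, each call site is compiled as a separate instantiation.

```rust
// Hypothetical stand-in for the sparse-matrix edge container in the question.
struct Edges {
    pairs: Vec<(usize, usize)>,
}

impl Edges {
    // Generic over the closure type F. Every closure expression has a distinct
    // anonymous type, so each call site gets its own monomorphized copy.
    fn map_edges<F: FnMut(usize, usize)>(&self, mut action: F) {
        for &(x, y) in &self.pairs {
            action(x, y);
        }
    }
}

// Sparse matrix-vector multiplication written as in the question.
fn spmv(edges: &Edges, src: &[f64], dst: &mut [f64]) {
    edges.map_edges(|x, y| dst[y] += src[x]);
}

fn main() {
    let edges = Edges { pairs: vec![(0, 1), (1, 1), (2, 0)] };
    let src = [1.0, 2.0, 3.0];
    let mut dst = [0.0, 0.0];
    spmv(&edges, &src, &mut dst);
    println!("{:?}", dst);
}
```

Monomorphization makes the closure body visible to the optimizer at each call site, but whether it is actually inlined is still up to the optimizer and the optimization level, so it is worth measuring rather than assuming.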
How do I produce a list containing all the integers in a given range in Rust? I'm looking for the equivalent of Haskell's [n..m] or Python's range(n, m+1) but can't find anything.
I'm aware of the int::range function and thought it was what I was looking for, but it is made to iterate over a range, not to produce it.
It is now possible to use ..= in Rust:
let vec: Vec<_> = (n ..= m).collect();
gives you a Vec from all the numbers from n to m.
..= is the inclusive range operator, whereas .. is exclusive.
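A short sketch of the difference between the two operators; the helper names inclusive and exclusive are invented for the example.

```rust
// n..=m includes the upper bound, like Haskell's [n..m].
fn inclusive(n: i32, m: i32) -> Vec<i32> {
    (n..=m).collect()
}

// n..m stops one short of the upper bound, like Python's range(n, m).
fn exclusive(n: i32, m: i32) -> Vec<i32> {
    (n..m).collect()
}

fn main() {
    println!("{:?}", inclusive(1, 5));
    println!("{:?}", exclusive(1, 5));
}
```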
Note that this answer pertains to a pre-1.0 version of Rust and does not apply for 1.0. Specifically, Vec::from_fn was removed.
There's probably nothing really idiomatic as of now. There is a handful of convenience functions to construct vectors, for example you can use Vec::from_fn:
Vec::from_fn(m+1-n, |i| i+n)
Note that this answer pertains to a pre-1.0 version of Rust and does not apply for 1.0. Specifically, std::iter::range and std::iter::range_inclusive were removed.
As of Rust 1.0.0-alpha, the easiest way to accomplish this is to use the convenience functions provided in the module std::iter: range and range_inclusive, which return iterators generating a list of numbers in the range [low, high) or [low, high], respectively.
In addition, you can build a vector from an iterator using the collect method:
use std::iter::range_inclusive;
let first_hundred: Vec<i32> = range_inclusive(1, 100).collect();
println!("upper bound inclusive: {:?}, exclusive: {:?}",
         first_hundred,
         range(101, 201).collect::<Vec<_>>());
Note that the return value of collect has its type explicitly specified in both its uses above. Normally, the Rust compiler can infer the types of expressions without an explicit specification, but collect is one of the most common cases for which the type cannot be fully inferred, in this case because it can't infer a concrete type that implements the trait FromIterator<A>, the return type of collect.
The type of a generic return value can be specified either as an explicit type in a let statement or inline by using the function::<Type>() syntax. Since inference fails only because the concrete type implementing FromIterator<A> is unknown, it's possible, when explicitly specifying a generic type, to leave "holes" for type arguments which will be inferred, signified by _. This is done with the second call to collect above: in the expression Vec<_>, it's explicitly specified that the container receiving elements from collect is a Vec<T>, but the compiler figures out what exact type T must be. Currently, integers whose types are left unspecified and can't be inferred fall back to i32 (a 32-bit signed integer) as a default.
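The same two ways of annotating collect still apply in current Rust. The following sketch uses the modern range syntax instead of the removed range functions; the function names are made up for the example.

```rust
use std::mem;

// Annotate the binding: the element type and container are both spelled out.
fn annotated() -> Vec<i32> {
    let v: Vec<i32> = (1..=3).collect();
    v
}

// Annotate the call: the container is given, the element type is a hole (_)
// that the compiler fills in from the range's item type (u64 here).
fn turbofish() -> Vec<u64> {
    (1u64..=3).collect::<Vec<_>>()
}

// With nothing else constraining the integer literals, the element type
// falls back to i32, so each element occupies four bytes.
fn fallback_element_size() -> usize {
    let v = (1..4).collect::<Vec<_>>();
    mem::size_of_val(&v[0])
}

fn main() {
    println!("{:?}", annotated());
    println!("{:?}", turbofish());
    println!("{}", fallback_element_size());
}
```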
Since Rust 1.26.0 you can use the inclusive range syntax (..=), which produces a RangeInclusive, to generate an inclusive range.
let v: Vec<_> = (n..=m).collect();