I can create an array from a tuple like this:
let a = (1, 2, 3);
let b = [a.0, a.1, a.2];
Is there a way to do it without naming each element of the tuple? Something like:
let b = a.to_array();
There is no such functionality at the moment; however, it would be perfectly possible to extend the set of implementations of the From trait to cover this use case (and its reverse).
This extension would have to be in the core crate because of the orphan rules, but we can readily demonstrate it with custom traits:
use std::convert::Into;
trait MyFrom<T> {
fn my_from(t: T) -> Self;
}
trait MyInto<U> {
fn my_into(self) -> U;
}
impl<T, U> MyInto<U> for T
where
U: MyFrom<T>
{
fn my_into(self) -> U { <U as MyFrom<T>>::my_from(self) }
}
impl<T> MyFrom<()> for [T; 0] {
fn my_from(_: ()) -> Self { [] }
}
impl<T, A> MyFrom<(A,)> for [T; 1]
where
A: Into<T>,
{
fn my_from(t: (A,)) -> Self { [t.0.into()] }
}
impl<T, A, B> MyFrom<(A, B)> for [T; 2]
where
A: Into<T>,
B: Into<T>,
{
fn my_from(t: (A, B)) -> Self { [t.0.into(), t.1.into()] }
}
Once defined, they're easy enough to use:
fn main() {
{
let array: [i64; 0] = ().my_into();
println!("{:?}", array);
}
{
let array: [i64; 1] = (1u32,).my_into();
println!("{:?}", array);
}
{
let array: [i64; 2] = (1u32, 2i16).my_into();
println!("{:?}", array);
}
}
which will print:
[]
[1]
[1, 2]
The reverse implementation would be just as easy; there's nothing mysterious here, it's just boilerplate (hurray for macros!).
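To make that macro remark concrete, here is a minimal sketch of a macro that could stamp out further impls of the custom MyFrom trait above; the macro name and the explicit index lists are illustrative, not part of the original answer:
// A sketch only: generates MyFrom impls for larger tuples, assuming the MyFrom
// trait defined above. Each arity still has to be listed explicitly.
macro_rules! impl_my_from_tuple {
    ($n:expr; $($ty:ident => $idx:tt),+) => {
        impl<T, $($ty),+> MyFrom<($($ty,)+)> for [T; $n]
        where
            $($ty: Into<T>,)+
        {
            fn my_from(t: ($($ty,)+)) -> Self {
                [$(t.$idx.into()),+]
            }
        }
    };
}
impl_my_from_tuple!(3; A => 0, B => 1, C => 2);
impl_my_from_tuple!(4; A => 0, B => 1, C => 2, D => 3);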
No, there isn't. What is more, you can't even iterate over tuples. A tuple is heterogeneous, so it's unfit for conversion to a homogeneous type like a vector or an array.
You could write a macro that iterates over the contents of a tuple of arbitrary length and collects the elements (as long as they are all of the same type), but you would still have to access/process every element individually, as the sketch below shows.
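For illustration, a minimal sketch of such a macro (the name and the explicit index list are mine); note how every element is still spelled out at the call site:
macro_rules! tuple_to_vec {
    ($t:expr => $($idx:tt),+) => {
        // Expands to vec![a.0, a.1, ...] for the listed indices.
        vec![$($t.$idx),+]
    };
}
fn main() {
    let a = (1, 2, 3);
    let v = tuple_to_vec!(a => 0, 1, 2);
    println!("{:?}", v); // [1, 2, 3]
}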
I'm writing a library to parse json data that looks like this:
{"x": [[1, "a"], [2, "b"]]}
i.e. I have a key with a list of lists where the inner lists can contain different data types but each inner list has the same sequence of types. The sequence of types for an inner list can change for different json schemas but will be known ahead of time.
The desired output would look something like:
vec![vec![1,2], vec!["a", "b"]]
(with the data wrapped in some appropriate enum for the different dtypes).
I began implementing DeserializeSeed for Vec<DataTypes>; below is some approximate pseudo-code.
enum DataTypes {
I32,
I64,
String,
F32,
F64
}
fn visit_seq<S>(self, mut seq: S) -> Result<Self::Value, S::Error>
where
S: SeqAccess<'de>,
{
let types: Vec<DataTypes> = self.0.data;
let out: Vec<Vec<...>>;
while let Some(inner_seq: S) = seq.next_element::<S>()? { // <-- this is the line
for (i, ty) in types.iter().enumerate() {
match ty {
DataTypes::I32 => out[i].push(inner_seq.next_element::<i32>()?),
DataTypes::I64 => out[i].push(inner_seq.next_element::<i64>()?),
...
}
}
}
}
My problem is that I can't seem to find a way to get SeqAccess for the inner lists, and I don't want to deserialize them into something like Vec<serde_json::Value> because I don't want to have to allocate the additional vector.
Please fasten your seat belts, this is verbose.
I'm assuming you want to deserialize some JSON data
{"x": [[1, "a"], [2, "b"]]}
to some Rust struct
struct X {
x: Vec<Vec<Value>>, // Value is some enum containing string/int/float…
}
all while
transposing the elements of the inner lists while inserting into the vectors
checking that the inner vector elements conform to some type passed to deserialization
not doing any transient allocations
At the start, you have to realize that you have three different types that you want to deserialize: X, Vec<Vec<Value>>, and Vec<Value>. (Value itself you don't need, because what you actually want to deserialize are strings and ints and whatnot, not Value itself.) So, you need three deserializers and three visitors.
The innermost Deserialize has a mutable reference to a Vec<Vec<Value>>, and distributes the elements of a single [1, "a"], one to each Vec<Value>.
struct ExtendVecs<'a>(&'a mut Vec<Vec<Value>>, &'a [DataTypes]);
impl<'de, 'a> DeserializeSeed<'de> for ExtendVecs<'a> {
type Value = ();
fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
where
D: Deserializer<'de>,
{
struct ExtendVecVisitor<'a>(&'a mut Vec<Vec<Value>>, &'a [DataTypes]);
impl<'de, 'a> Visitor<'de> for ExtendVecVisitor<'a> {
type Value = ();
fn visit_seq<A>(self, mut seq: A) -> Result<(), A::Error>
where
A: SeqAccess<'de>,
{
for (i, typ) in self.1.iter().enumerate() {
match typ {
// too_short checks for None and turns it into Err("expected more elements")
DataTypes::Stri => self.0[i].push(Value::Stri(too_short(self.1, seq.next_element::<String>())?)),
DataTypes::Numb => self.0[i].push(Value::Numb(too_short(self.1, seq.next_element::<f64>())?)),
}
}
// TODO: check all elements consumed
Ok(())
}
}
deserializer.deserialize_seq(ExtendVecVisitor(self.0, self.1))
}
}
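The snippet above leans on a few definitions that the answer leaves to the linked playground; plausible sketches of them follow (the names come from the code above, the bodies are assumptions):
#[derive(Debug)]
enum Value {
    Stri(String),
    Numb(f64),
}
enum DataTypes {
    Stri,
    Numb,
}
// Turns a missing inner element into a proper error instead of silently stopping.
fn too_short<T, E: serde::de::Error>(
    types: &[DataTypes],
    element: Result<Option<T>, E>,
) -> Result<T, E> {
    element?.ok_or_else(|| E::custom(format!("expected {} elements", types.len())))
}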
The middle Deserialize constructs the Vec<Vec<Value>>, gives the innermost ExtendVecs access to the Vec<Vec<Value>>, and asks ExtendVecs to have a look at each of the [[…], […]]:
struct TransposeVecs<'a>(&'a [DataTypes]);
impl<'de, 'a> DeserializeSeed<'de> for TransposeVecs<'a> {
type Value = Vec<Vec<Value>>;
fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
where
D: Deserializer<'de>,
{
struct TransposeVecsVisitor<'a>(&'a [DataTypes]);
impl<'de, 'a> Visitor<'de> for TransposeVecsVisitor<'a> {
type Value = Vec<Vec<Value>>;
fn visit_seq<A>(self, mut seq: A) -> Result<Vec<Vec<Value>>, A::Error>
where
A: SeqAccess<'de>,
{
let mut vec = Vec::new();
vec.resize_with(self.0.len(), || vec![]);
while let Some(()) = seq.next_element_seed(ExtendVecs(&mut vec, self.0))? {}
Ok(vec)
}
}
Ok(deserializer.deserialize_seq(TransposeVecsVisitor(self.0))?)
}
}
Finally, the outermost Deserialize is nothing special anymore, it just hands access to the type array down:
struct XD<'a>(&'a [DataTypes]);
impl<'de, 'a> DeserializeSeed<'de> for XD<'a> {
type Value = X;
fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
where
D: Deserializer<'de>,
{
struct XV<'a>(&'a [DataTypes]);
impl<'de, 'a> Visitor<'de> for XV<'a> {
type Value = X;
fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
where
A: serde::de::MapAccess<'de>,
{
let k = map.next_key::<String>()?;
// TODO: check k = "x"
Ok(X { x: map.next_value_seed(TransposeVecs(self.0))? })
}
}
Ok(deserializer.deserialize_struct("X", &["x"], XV(self.0))?)
}
}
Now, you can seed the outermost Deserialize with your desired type list and use it to deserialize one X, e.g.:
XD(&[DataTypes::Numb, DataTypes::Stri]).deserialize(
&mut serde_json::Deserializer::from_str(r#"{"x": [[1, "a"], [2, "b"]]}"#)
)
Playground with all the left-out error handling
Side note: if you can (i.e. if the format you're deserializing is self-describing, like JSON), I'd recommend doing the type checking after deserialization. Why? Because doing it during deserialization means that all deserializers up to the top one must be DeserializeSeed, and you can't use #[derive(Deserialize)]. If you do the type checking after, you can #[derive(Deserialize)], annotate the field as #[serde(deserialize_with = "TransposeVecs_deserialize_as_free_function")] x: Vec<Vec<Value>>, and save half of the cruft in this post.
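A simplified sketch of that alternative, assuming JSON input and the same Stri/Numb Value shape; unlike the seeded version it deserializes the rows first and transposes afterwards, so it accepts one transient allocation, and any check against the expected DataTypes would run on the result:
use serde::{Deserialize, Deserializer};
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum Value {
    Numb(f64),
    Stri(String),
}
#[derive(Debug, Deserialize)]
struct X {
    #[serde(deserialize_with = "transpose_vecs")]
    x: Vec<Vec<Value>>,
}
fn transpose_vecs<'de, D>(deserializer: D) -> Result<Vec<Vec<Value>>, D::Error>
where
    D: Deserializer<'de>,
{
    // Deserialize the rows as-is, then move each element into its column vector.
    let rows: Vec<Vec<Value>> = Deserialize::deserialize(deserializer)?;
    let cols = rows.first().map_or(0, |row| row.len());
    let mut out: Vec<Vec<Value>> = (0..cols).map(|_| Vec::new()).collect();
    for row in rows {
        for (i, value) in row.into_iter().enumerate() {
            // A real version would also reject rows longer or shorter than `cols`.
            out[i].push(value);
        }
    }
    Ok(out)
}
fn main() {
    let x: X = serde_json::from_str(r#"{"x": [[1, "a"], [2, "b"]]}"#).unwrap();
    println!("{:?}", x.x); // [[Numb(1.0), Numb(2.0)], [Stri("a"), Stri("b")]]
}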
I'm new to Rust and still learning how to represent different design patterns in the language; on top of that, my recent background is heavily OO design, which is not helping me think about the problem.
In the example below I want to replace the sort_ascending conditional with something more elegant, and ideally have it fully resolved at compile time.
struct MyList {
ladder: Vec<i32>,
sort_ascending: bool,
}
impl MyList {
pub fn new(data: &[i32], sort_ascending: bool) -> MyList {
return MyList {
ladder: data.to_vec(),
sort_ascending: sort_ascending,
};
}
pub fn get_best(&self) -> Option<&i32> {
if self.sort_ascending {
return self.ladder.iter().reduce(|a, b| if a >= b { a } else { b });
} else {
return self.ladder.iter().reduce(|a, b| if a <= b { a } else { b });
}
}
}
fn main() {
let x = MyList::new(&[10, 4, 30, 2, 5, 2], true);
let r = x.get_best();
println!("{:?}", r);
let x = MyList::new(&[10, 4, 30, 2, 5, 2], false);
let r = x.get_best();
println!("{:?}", r);
}
I think the solution lies in making the closure passed to reduce() configurable and have tried many variations along the following lines without success:
struct MyList<F>
where F : Fn(i32,i32) -> i32
{
ladder: Vec<i32>,
sort_closure : F
}
impl<F> MyList<F>
where F : Fn(i32,i32) -> i32
{
pub fn new(data: &[i32], sort_ascending: bool) -> MyList<F> {
if sort_ascending {
return MyList {
ladder: data.to_vec(),
sort_closure : |a, b| if a >= b { a } else { b },
};
} else {
return MyList {
ladder: data.to_vec(),
sort_closure : |a, b| if a <= b { a } else { b },
};
}
}
pub fn get_best(&self) -> Option<&i32> {
return self.ladder.iter().reduce(self.sort_closure);
}
}
fn main() {
let x = MyList::new(&[10, 4, 30, 2, 5, 2], true);
let r = x.get_best();
println!("{:?}", r);
let x = MyList::new(&[10, 4, 30, 2, 5, 2], false);
let r = x.get_best();
println!("{:?}", r);
}
I would really appreciate any pointers on how to make the above code compile, or a pointer towards what would be the correct approach to take to implement this pattern.
A straightforward fix is to use a function pointer (fn) instead of trying to use a generic parameter on MyList.
The following solution is similar to what user Jmb suggested in the comment above, but simplified further by
Avoiding lifetimes in the function pointer
Using std::cmp::{max, min} instead of writing our own closures (though the code still works if you were to define the closures yourself, as before)
use std::cmp;
struct MyList {
ladder: Vec<i32>,
sort_closure: fn(i32, i32) -> i32,
}
impl MyList {
pub fn new(data: &[i32], sort_ascending: bool) -> MyList {
MyList {
ladder: data.to_vec(),
sort_closure: match sort_ascending {
true => cmp::max,
false => cmp::min,
},
}
}
pub fn get_best(&self) -> Option<i32> {
self.ladder.iter().copied().reduce(self.sort_closure)
}
}
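For reference, this version can be exercised with the same main as in the question (get_best now yields Option<i32> rather than Option<&i32>, which {:?} prints the same way):
fn main() {
    let x = MyList::new(&[10, 4, 30, 2, 5, 2], true);
    println!("{:?}", x.get_best()); // Some(30)
    let x = MyList::new(&[10, 4, 30, 2, 5, 2], false);
    println!("{:?}", x.get_best()); // Some(2)
}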
Why the original approach doesn't work
This is what the MyList definition looked like:
struct MyList<F>
where F : Fn(i32,i32) -> i32
{
ladder: Vec<i32>,
sort_closure : F
}
The reason this doesn't work has to do with the type parameter F on MyList. Having the type parameter there means the caller (the code in main()) is free to instantiate MyList<F> with any F of their choosing that satisfies the Fn(i32, i32) -> i32 constraint. However, inside the MyList::new() function, we are defining our own two closures, each with its own unique, unnameable type, and then trying to store one of them in MyList. This is the contradiction: the caller doesn't get to pick the type of the closure to be stored, except indirectly with the sort_ascending flag. A sketch of a version where the caller does pick the closure follows below.
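Here is a minimal sketch of such a caller-supplied version; the constructor name with_closure is illustrative, not from the original:
struct MyList<F>
where
    F: Fn(i32, i32) -> i32,
{
    ladder: Vec<i32>,
    sort_closure: F,
}
impl<F> MyList<F>
where
    F: Fn(i32, i32) -> i32,
{
    // The caller picks F by handing in a concrete closure or function.
    pub fn with_closure(data: &[i32], sort_closure: F) -> MyList<F> {
        MyList {
            ladder: data.to_vec(),
            sort_closure,
        }
    }
    pub fn get_best(&self) -> Option<i32> {
        self.ladder.iter().copied().reduce(&self.sort_closure)
    }
}
fn main() {
    let ascending = MyList::with_closure(&[10, 4, 30, 2, 5, 2], std::cmp::max);
    println!("{:?}", ascending.get_best()); // Some(30)
    let descending = MyList::with_closure(&[10, 4, 30, 2, 5, 2], |a, b| if a <= b { a } else { b });
    println!("{:?}", descending.get_best()); // Some(2)
}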
I'd like a method like Iterator::chain() that only computes the argument iterator when it's needed. In the following code, expensive_function should never be called:
use std::{thread, time};
fn expensive_function() -> Vec<u64> {
thread::sleep(time::Duration::from_secs(5));
vec![4, 5, 6]
}
pub fn main() {
let nums = [1, 2, 3];
for &i in nums.iter().chain(expensive_function().iter()) {
if i > 2 {
break;
} else {
println!("{}", i);
}
}
}
One possible approach: delegate the expensive computation to an iterator adaptor.
let nums = [1, 2, 3];
for i in nums.iter()
.cloned()
.chain([()].into_iter().flat_map(|_| expensive_function()))
{
if i > 2 {
break;
} else {
println!("{}", i);
}
}
Playground
The chained iterator flat-maps a dummy unit value () to the expensive list of values; since flat_map is lazy, expensive_function is only called if the first iterator gets exhausted. Because the second half of the chain yields owned values, the first half has to as well, so I used .cloned() to copy the numbers out of the array.
You can create your own custom iterator adapter that only evaluates a closure when the original iterator is exhausted.
trait IteratorExt: Iterator {
fn chain_with<F, I>(self, f: F) -> ChainWith<Self, F, I::IntoIter>
where
Self: Sized,
F: FnOnce() -> I,
I: IntoIterator<Item = Self::Item>,
{
ChainWith {
base: self,
factory: Some(f),
iterator: None,
}
}
}
impl<I: Iterator> IteratorExt for I {}
struct ChainWith<B, F, I> {
base: B,
factory: Option<F>,
iterator: Option<I>,
}
impl<B, F, I> Iterator for ChainWith<B, F, I::IntoIter>
where
B: Iterator,
F: FnOnce() -> I,
I: IntoIterator<Item = B::Item>,
{
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
if let Some(b) = self.base.next() {
return Some(b);
}
// Exhausted the first, generate the second
if let Some(f) = self.factory.take() {
self.iterator = Some(f().into_iter());
}
self.iterator
.as_mut()
.expect("There must be an iterator")
.next()
}
}
use std::{thread, time};
fn expensive_function() -> Vec<u64> {
panic!("You lose, good day sir");
thread::sleep(time::Duration::from_secs(5));
vec![4, 5, 6]
}
pub fn main() {
let nums = [1, 2, 3];
for i in nums.iter().cloned().chain_with(|| expensive_function()) {
if i > 2 {
break;
} else {
println!("{}", i);
}
}
}
The code below works. It evaluates x and ys lazily and caches into Foo::x: Cell, Foo::ys: RefCell respectively.
However, I feel there might be a better way to do it. I dislike that I have to make a wrapper, CacheVecGuard, just so that at the call site I can use self.borrow_ys() instead of the lengthy &self.ys.borrow().1.
How can I improve this piece of code?
Are there any canonical snippets for lazy evaluation or memoization that are suitable in this case? (I am aware of lazy_static, which doesn't fit.)
use std::cell::{RefCell, Cell, Ref};
use std::ops::Deref;
struct CacheVecGuard<'a>(Ref<'a, (bool, Vec<f64>)>);
impl<'a> Deref for CacheVecGuard<'a> {
type Target = [f64];
fn deref(&self) -> &Self::Target {
&(self.0).1
}
}
fn pre_calculate_x(x: f64) -> f64 {
x
}
fn pre_calculate_ys(x: f64, ys: &mut [f64]) {
for i in 0..ys.len() {
ys[i] += 1.0;
}
}
struct Foo {
pub a: f64,
x: Cell<Option<f64>>,
ys: RefCell<(bool, Vec<f64>)>,
}
impl Foo {
pub fn new(a: f64) -> Self {
Self {
a,
x: Cell::new(None),
ys: RefCell::new((false, vec![0.0; 10])),
}
}
fn get_x(&self) -> f64 {
match self.x.get() {
None => {
let x = pre_calculate_x(self.a);
self.x.set(Some(x));
println!("Set x to {}", x);
x
}
Some(x) => x,
}
}
fn borrow_ys(&self) -> CacheVecGuard {
{
let (ref mut ready, ref mut ys) = *self.ys.borrow_mut();
if !*ready {
pre_calculate_ys(self.a, ys);
println!("Set ys to {:?}", ys);
*ready = true;
}
}
CacheVecGuard(self.ys.borrow())
}
fn clear_cache(&mut self) {
*(&mut self.ys.borrow_mut().0) = false;
self.x.set(None);
}
pub fn test(&self) -> f64 {
self.borrow_ys()[0] + self.get_x()
}
pub fn set_a(&mut self, a: f64) {
self.a = a;
self.clear_cache();
}
}
fn main() {
let mut foo = Foo::new(1.0);
println!("{}", foo.test());
foo.set_a(3.0);
println!("{}", foo.test());
}
It prints
Set ys to [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
Set x to 1
2
Set ys to [2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
Set x to 3
5
Playground
The fact that you need the ability to clear the cache means that you must have a guard. Otherwise, a call to set_a could invalidate a bare reference returned earlier by borrow_ys. The only way the compiler can verify that this doesn't happen is by returning a guard and borrowing from the guard instead.
If you can do away with the ability to clear the cache, you could use the LazyCell type from the lazycell crate instead.
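A minimal sketch of what that could look like for ys, assuming the lazycell crate's LazyCell::borrow_with and reusing pre_calculate_ys from the question; note there is no equivalent of clear_cache here, which is exactly the trade-off:
use lazycell::LazyCell;
struct Foo {
    a: f64,
    ys: LazyCell<Vec<f64>>,
}
impl Foo {
    fn new(a: f64) -> Self {
        Foo { a, ys: LazyCell::new() }
    }
    // The closure runs only on the first call; later calls return the cached value.
    fn borrow_ys(&self) -> &[f64] {
        self.ys.borrow_with(|| {
            let mut ys = vec![0.0; 10];
            pre_calculate_ys(self.a, &mut ys);
            ys
        })
    }
}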
I want to get the size of all dimensions of an array in Rust, but I'm not sure how to go about this. I'm able to get the length of the array using x.len(), but I need to somehow do this recursively.
I want to be able to do something like this:
let x = [[1, 2, 3], [4, 5, 6]];
println!("{:?}", x.dimensions());
// [2, 3]
A slice with a shape like [[1], [2, 3], [4, 5, 6]] should give an error.
It's not possible to do this in a generic fashion for every possible depth of nesting. Rust is a statically typed language, so you have to know your input and output types. What is the input type for [1], and what is the input type for [[1]]? Likewise, what are the corresponding output types?
The closest I know of is a trait with an associated type. This allows implementing it for a specific type which then associates another output type:
trait Thing {
type Dimensions;
fn thing(self) -> Self::Dimensions;
}
However, as soon as you implement it, you run into problems:
impl<'a, T> Thing for &'a[T] {
type Dimensions = usize;
fn thing(self) -> usize {
self.len()
}
}
impl<'a, T> Thing for &'a[&'a[T]] {
type Dimensions = [usize; 2];
fn thing(self) -> Self::Dimensions {
[self.len(), self[0].len()]
}
}
error[E0119]: conflicting implementations of trait `Thing` for type `&[&[_]]`:
--> src/main.rs:14:1
|
6 | impl<'a, T> Thing for &'a[T] {
| - first implementation here
...
14 | impl<'a, T> Thing for &'a[&'a[T]] {
| ^ conflicting implementation for `&[&[_]]`
That's because a &[&[T]] is itself a &[T], so the first, more general implementation already covers it.
You may also think to try something recursive, but given a &[T] there's no way to know whether T can itself be iterated further. If you had a HasLength trait and a DoesntHaveLength trait, nothing stops you from implementing both traits for a single type. Thus, you are stopped again.
Here's one partial attempt at using specialization:
#![feature(specialization)]
trait Dimensions: Sized {
fn dimensions(self) -> Vec<usize> {
let mut answers = vec![];
self.dimensions_core(&mut answers);
answers
}
fn dimensions_core(self, answers: &mut Vec<usize>);
}
impl<'a, T> Dimensions for &'a [T] {
default fn dimensions_core(self, answers: &mut Vec<usize>) {
answers.push(self.len());
}
}
impl<'a, T> Dimensions for &'a [T]
where T: Dimensions + Copy
{
fn dimensions_core(self, answers: &mut Vec<usize>) {
answers.push(self.len());
self[0].dimensions_core(answers);
}
}
impl<T> Dimensions for [T; 2] {
default fn dimensions_core(self, answers: &mut Vec<usize>) {
answers.push(2)
}
}
impl<T> Dimensions for [T; 2]
where T: Dimensions + Copy
{
fn dimensions_core(self, answers: &mut Vec<usize>) {
answers.push(2);
self[0].dimensions_core(answers);
}
}
impl<T> Dimensions for [T; 3] {
default fn dimensions_core(self, answers: &mut Vec<usize>) {
answers.push(3)
}
}
impl<T> Dimensions for [T; 3]
where T: Dimensions + Copy
{
fn dimensions_core(self, answers: &mut Vec<usize>) {
answers.push(3);
self[0].dimensions_core(answers);
}
}
// Also implement for all the other sizes of array as well as `Vec`
fn main() {
let x = [[1, 2, 3], [4, 5, 6]];
println!("{:?}", x.dimensions());
let x = [[1, 2], [3, 4], [5, 6]];
println!("{:?}", x.dimensions());
}
It has the obvious downside that you still have to implement the trait for each array size in order to get specialization to kick in.
I'm guessing that you are coming from a language that is highly dynamic. Different languages have different strengths and weaknesses. In Rust, you know your input types, so there's no situation where a function wouldn't know the nesting of its argument. If it's going to receive a Vec<T> or a Vec<&[Vec<T>]>, I know the depth of nesting ahead of time, so I can write a function that returns the length of each level:
fn depth3<A, B, C, T>(a: A) -> [usize; 3]
where A: AsRef<[B]>,
B: AsRef<[C]>,
C: AsRef<[T]>
{
let a = a.as_ref();
// All of these should check that the length is > 1
// and possibly that all children have same length
let b = a[0].as_ref();
let c = b[0].as_ref();
[a.len(), b.len(), c.len()]
}
fn main() {
let x = [[[1], [2], [3]], [[4], [5], [6]]];
println!("{:?}", depth3(&x));
}
This function is as generic as I think it can be - you pass in references to arrays, slices, vectors, or direct values for those types. In fact, I can't think of a way to even define a slice/vector/array with an unknown depth. I think to do something like that you'd have to introduce some new type (likely an enum) with some indirection so that you could have a non-infinite size.
An array has a single element type: in [T; N], T can't be both [U; 2] and [U; 3], so a shape like [[1], [2, 3], [4, 5, 6]] wouldn't even get past compilation.
If you instead used a Vec<Vec<T>>, as @Shepmaster hints, you could do something like this:
fn main() {
let x = vec![vec![1, 2, 3], vec![4, 5]];
println!("{:?}", get_2d_dimension(&x));
}
fn get_2d_dimension<T>(arr: &[Vec<T>]) -> Result<(usize, usize), &str> {
let rows = arr.len();
if rows <= 1 {
return Err("Not 2d");
}
let cols = arr[0].len();
if arr.iter().skip(1).filter(|v| v.len() == cols).count() != rows - 1 {
Err("Not square.")
} else {
Ok((rows, cols))
}
}
As others have noted, finding the dimensions of a "vanilla" nested list is impossible. However, you can choose to implement a custom nested list data structure, like so:
#[derive(Clone, Debug)]
pub enum NestedList<S>
where S: Clone
{
Integer(S),
Array(Vec<NestedList<S>>)
}
Then you'd have to rewrite your nested list using NestedList:
use NestedList::Integer as i;
use NestedList::Array as a;
fn main() {
let array = a(vec![
a(vec![i(1), i(2), i(3)]),
a(vec![i(4), i(5), i(6)])
]);
}
from which you can find the dimensions. Here is an implementation of this method. It is very verbose, but I hope this is what you were looking for?
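For illustration, here is a compact sketch (not the linked implementation) of how the dimensions could be computed over NestedList, returning an error for ragged input such as [[1], [2, 3], [4, 5, 6]]:
fn dimensions<S: Clone>(list: &NestedList<S>) -> Result<Vec<usize>, String> {
    match list {
        NestedList::Integer(_) => Ok(vec![]),
        NestedList::Array(items) => {
            let mut inner: Option<Vec<usize>> = None;
            for item in items {
                let dims = dimensions(item)?;
                match &inner {
                    None => inner = Some(dims),
                    Some(existing) if *existing == dims => {}
                    Some(_) => return Err("ragged nested list".to_string()),
                }
            }
            // This level's length, then the (uniform) shape shared by its children.
            let mut result = vec![items.len()];
            result.extend(inner.unwrap_or_default());
            Ok(result)
        }
    }
}
fn main() {
    use NestedList::Integer as i;
    use NestedList::Array as a;
    let array = a(vec![
        a(vec![i(1), i(2), i(3)]),
        a(vec![i(4), i(5), i(6)]),
    ]);
    println!("{:?}", dimensions(&array)); // Ok([2, 3])
}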