Lazy join multiple DataFrames on a Categorical - rust

I am trying to implement the "Lazy join multiple DataFrames on a Categorical" sample:
use polars::prelude::*;
fn lazy_example(mut df_a: LazyFrame, mut df_b: LazyFrame) -> Result<DataFrame> {
let q1 = df_a.with_columns(vec![
col("a").cast(DataType::Categorical),
]);
let q2 = df_b.with_columns(vec![
col("b").cast(DataType::Categorical)
]);
q1.inner_join(q2, col("a"), col("b"), None).collect()
}
I get this error:
error[E0308]: mismatched types
--> src\main.rs:6:23
|
6 | col("a").cast(DataType::Categorical),
| ---- ^^^^^^^^^^^^^^^^^^^^^ expected enum `polars::prelude::DataType`, found fn item
| |
| arguments to this function are incorrect
|
= note: expected enum `polars::prelude::DataType`
found fn item `fn(Option<Arc<RevMapping>>) -> polars::prelude::DataType {polars::prelude::DataType::Categorical}`
note: associated function defined here
--> C:\Users\rnio\.cargo\registry\src\github.com-1ecc6299db9ec823\polars-lazy-0.23.1\src\dsl\mod.rs:555:12
|
555 | pub fn cast(self, data_type: DataType) -> Self {
| ^^^^
help: use parentheses to instantiate this tuple variant
|
6 | col("a").cast(DataType::Categorical(_)),
| +++
I applied the suggested fix:
col("a").cast(DataType::Categorical()),
col("b").cast(DataType::Categorical()),
and got the next error:
error[E0061]: this enum variant takes 1 argument but 0 arguments were supplied
--> src\main.rs:7:23
|
7 | col("a").cast(DataType::Categorical()),
| ^^^^^^^^^^^^^^^^^^^^^-- an argument of type `Option<Arc<RevMapping>>` is missing
|
note: tuple variant defined here
--> C:\Users\rnio\.cargo\registry\src\github.com-1ecc6299db9ec823\polars-core-0.23.1\src\datatypes\mod.rs:707:5
|
707 | Categorical(Option<Arc<RevMapping>>),
| ^^^^^^^^^^^
help: provide the argument
|
7 | col("a").cast(DataType::Categorical(/* Option<Arc<RevMapping>> */)),
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
So it's missing an argument for Categorical(), even though that argument will not be used:
// The RevMapping has the internal state.
// This is ignored with casts, comparisons, hashing etc.
https://docs.rs/polars/latest/polars/datatypes/enum.RevMapping.html
Any idea how to fix this?
Thanks

Thanks to @Dogbert :)
Here is the working code. Besides passing None to Categorical, it also drops the fourth argument of inner_join:
fn lazy_example(df_a: LazyFrame, df_b: LazyFrame) -> Result<DataFrame> {
    let q1 = df_a.with_columns(vec![
        col("a").cast(DataType::Categorical(None)),
    ]);
    let q2 = df_b.with_columns(vec![
        col("b").cast(DataType::Categorical(None)),
    ]);
    q1.inner_join(q2, col("a"), col("b")).collect()
}

Related

Rust Polars Series compare to scalar not working

I am trying to figure out why the sample code does not work.
This is in my Cargo.toml:
polars = "*"
This is the sample from Polars Eager cookbook:
use polars::prelude::*;
fn main() {
let s = Series::new("a", &[1, 2, 3]);
let ca = UInt32Chunked::new("b", &[Some(3), None, Some(1)]);
println!("{:?}", s.eq(2));
println!("{:?}", ca.eq(2));
}
It looks like the "eq" function is not properly overloaded?! I am getting the following errors:
error[E0308]: mismatched types
--> src\main.rs:7:27
|
7 | println!("{:?}", s.eq(2));
| -- ^ expected `&polars::prelude::Series`, found integer
| |
| arguments to this function are incorrect
|
note: associated function defined here
--> C:\Users\rnio\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib/rustlib/src/rust\library\core\src\cmp.rs:228:8
|
228 | fn eq(&self, other: &Rhs) -> bool;
| ^^
error[E0599]: `ChunkedArray<UInt32Type>` is not an iterator
--> src\main.rs:9:25
|
9 | println!("{:?}", ca.eq(2));
| ^^ `ChunkedArray<UInt32Type>` is not an iterator
|
::: C:\Users\rnio\.cargo\registry\src\github.com-1ecc6299db9ec823\polars-core-0.22.7\src\chunked_array\mod.rs:143:1
|
143 | pub struct ChunkedArray<T> {
| -------------------------- doesn't satisfy `ChunkedArray<UInt32Type>: Iterator`
|
= note: the following trait bounds were not satisfied:
`ChunkedArray<UInt32Type>: Iterator`
which is required by `&mut ChunkedArray<UInt32Type>: Iterator`
Thanks to @isaactfa: the current workaround is to convert the Series to a ChunkedArray before comparing. As the errors show, eq resolves to PartialEq::eq on the Series and to Iterator::eq on the ChunkedArray, so neither call reaches a Polars comparison method; equal does.
Here is working code:
use polars::prelude::*;
fn main() {
    let s = Series::new("a", &[1, 2, 3]);
    let ca = UInt32Chunked::new("b", &[Some(3), None, Some(1)]);
    println!("{:?}", s.i32().unwrap().equal(2));
    println!("{:?}", ca.equal(3));
}

How to deny int variable without explicit type annotation in Rust?

I often use integer variables without a type annotation:
let mut counter = 0;
If there is no other constraint, rustc infers that counter is an i32 (see rfcs/0212-restore-int-fallback.md at master · rust-lang/rfcs).
Sometimes this causes a problem: overflow.
for _ in 0..1_000_000_000_000usize {
    counter += 1; // overflow!
    ...
}
So I want rustc to deny integer variables that have no explicit type annotation:
let mut counter = 0; // Deny
let mut counter: u64 = 0; // OK, because `counter` is explicitly annotated with `u64`
How can I do this?
There is a new Clippy lint, default_numeric_fallback:
#[warn(clippy::default_numeric_fallback)]
pub fn is_yelled(s: &str) -> bool {
let (n_upper, n_lower) = s.chars().fold((0, 0), |(n_upper, n_lower), c| {
if c.is_uppercase() {
(n_upper + 1, n_lower)
} else if c.is_lowercase() {
(n_upper, n_lower + 1)
} else {
(n_upper, n_lower)
}
});
n_upper > n_lower
}
will output:
warning: default numeric fallback might occur
--> src/lib.rs:4:46
|
4 | let (n_upper, n_lower) = s.chars().fold((0, 0), |(n_upper, n_lower), c| {
| ^ help: consider adding suffix: `0_i32`
|
note: the lint level is defined here
--> src/lib.rs:1:8
|
1 | #[warn(clippy::default_numeric_fallback)]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#default_numeric_fallback
warning: default numeric fallback might occur
--> src/lib.rs:4:49
|
4 | let (n_upper, n_lower) = s.chars().fold((0, 0), |(n_upper, n_lower), c| {
| ^ help: consider adding suffix: `0_i32`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#default_numeric_fallback
warning: default numeric fallback might occur
--> src/lib.rs:6:24
|
6 | (n_upper + 1, n_lower)
| ^ help: consider adding suffix: `1_i32`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#default_numeric_fallback
warning: default numeric fallback might occur
--> src/lib.rs:8:33
|
8 | (n_upper, n_lower + 1)
| ^ help: consider adding suffix: `1_i32`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#default_numeric_fallback
Unfortunately, the lint is not warn-by-default for now, so it has to be enabled explicitly. It is also strange that it suggests i32 as the default: this code sample, for example, should use usize.
Source: https://github.com/rust-lang/rust-clippy/issues/6064
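To make the fallback and the effect of an explicit annotation concrete, here is a minimal sketch using only the standard library. The helper name count_to is hypothetical, and the loop bound is kept small so the example finishes quickly:

```rust
use std::mem::size_of_val;

/// Counts from zero up to `n` with an explicitly annotated `u64` counter,
/// so the counter cannot silently fall back to `i32`.
fn count_to(n: u64) -> u64 {
    let mut counter: u64 = 0;
    for _ in 0..n {
        counter += 1;
    }
    counter
}

fn main() {
    // No constraint on the literal: the compiler falls back to i32 (4 bytes).
    let fallback = 0;
    // Explicit annotation: 8 bytes.
    let annotated: u64 = 0;
    println!("{} {}", size_of_val(&fallback), size_of_val(&annotated));

    // checked_add returns None on overflow instead of panicking or wrapping.
    println!("{:?}", i32::MAX.checked_add(1));

    println!("{}", count_to(1_000));
}
```

The lint only points at the places where the fallback happens; choosing the right type (u64, usize, ...) is still up to you.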

Unable to mock a trait returning Option<&String>

I am trying to mock a trait using the mockall crate:
#[automock]
trait Foo {
fn foo(input: &Vec<String>) -> Option<&String>;
}
However, I get the following error:
error[E0637]: `&` without an explicit lifetime name cannot be used here
--> src/names_matcher.rs:79:51
|
79 | fn foo(input: &Vec<String>) -> Option<&String>;
| ^ explicit lifetime name needed here
error[E0623]: lifetime mismatch
--> src/names_matcher.rs:77:9
|
77 | #[automock]
| ^^^^^^^^^^^
| |
| ...but data from `input` is returned here
78 | trait Foo {
79 | fn foo(input: &Vec<String>) -> Option<&String>;
| ------------ this parameter and the return type are declared with different lifetimes...
The function I want to implement will return either None or Some with a reference to one of the elements of the vector in input. If I try to define the lifetimes taking this into account:
#[automock]
trait Foo {
fn foo<'r>(input: &'r Vec<String>) -> Option<&'r String>;
}
I get the following:
error[E0261]: use of undeclared lifetime name `'r`
--> src/names_matcher.rs:79:20
|
79 | fn foo<'r>(input: &'r Vec<String>) -> Option<&'r String>;
| ^^ undeclared lifetime
|
help: consider introducing lifetime `'r` here
|
77 | #[automock]<'r>
| ^^^^
help: consider introducing lifetime `'r` here
|
77 | 'r, #[automock]
| ^^^
But none of the suggestions work; they produce syntax errors. Is there a way to mock a trait like the one I defined above?
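One possible way around this, sketched under the assumption that a hand-written test double is acceptable in place of #[automock]: the trait with the explicit lifetime is valid Rust on its own (the errors above point at the attribute, so they appear to come from the macro expansion). This sketch does not use mockall at all, and FirstMatch and pick are hypothetical names:

```rust
/// The trait from the question, with the lifetime written out explicitly.
trait Foo {
    fn foo<'r>(input: &'r Vec<String>) -> Option<&'r String>;
}

/// Hypothetical hand-written test double: always returns the first element.
struct FirstMatch;

impl Foo for FirstMatch {
    fn foo<'r>(input: &'r Vec<String>) -> Option<&'r String> {
        input.first()
    }
}

/// Code under test is generic over the trait, so a test can plug in the double.
fn pick<'r, F: Foo>(input: &'r Vec<String>) -> Option<&'r String> {
    F::foo(input)
}

fn main() {
    let names = vec!["alice".to_string(), "bob".to_string()];
    println!("{:?}", pick::<FirstMatch>(&names));
}
```

The trade-off is that expectations (call counts, argument matchers) have to be written by hand instead of being generated.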

Why can't I pass a String from env::Args to Path::new?

Consider the following example:
use std::env;
use std::path::Path;
fn main() {
let args: Vec<_> = env::args().collect();
let out_path: String = args[2];
let _path = Path::new(out_path);
}
Here's the error I'm getting while compiling:
error[E0308]: mismatched types
--> main.rs:8:27
|
8 | let _path = Path::new(out_path);
| ^^^^^^^^
| |
| expected reference, found struct `std::string::String`
| help: consider borrowing here: `&out_path`
|
= note: expected type `&_`
found type `std::string::String`
Now if I follow compiler's suggestion, I get this:
error[E0507]: cannot move out of indexed content
--> main.rs:7:28
|
7 | let out_path: String = args[2];
| ^^^^^^^
| |
| cannot move out of indexed content
| help: consider using a reference instead: `&args[2]`
error: aborting due to previous error
Which, after applying the suggestion, leads me to the previous error:
error[E0308]: mismatched types
--> main.rs:7:28
|
7 | let out_path: String = &args[2];
| ^^^^^^^^
| |
| expected struct `std::string::String`, found reference
| help: consider removing the borrow: `args[2]`
|
= note: expected type `std::string::String`
found type `&std::string::String`
How can I understand the situation and solve the problem?
This was indeed an unfortunate sequence of suggestions (use a reference, then remove that reference), but it was caused by the explicit String type annotation on out_path.
You want a string slice, not an owned String:
let out_path: &str = &args[2];
This fits both the restriction of args (you can't move out of indexed content) and the requirements of Path::new, which requires a reference.
As for your comment, a clone() "fixes" the cannot move out of indexed content error because it doesn't require a move from the args vector - it copies an element from it instead. This fix is of course inferior to just borrowing it, which also works with Path::new.
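Putting the borrow together, a minimal sketch of the fixed program. The helper output_path is a hypothetical name, and it uses get(2) instead of indexing so that a missing argument yields None rather than a panic:

```rust
use std::env;
use std::path::Path;

/// Borrows the third argument (index 2) as a `&Path` without moving or cloning it.
fn output_path(args: &[String]) -> Option<&Path> {
    args.get(2).map(|s| Path::new(s.as_str()))
}

fn main() {
    let args: Vec<String> = env::args().collect();
    match output_path(&args) {
        Some(path) => println!("output path: {}", path.display()),
        None => eprintln!("expected at least two arguments"),
    }
}
```

The returned Path borrows from args, so args has to outlive it; that is exactly why no move out of the vector is needed.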

How to write a Rust function that returns a reference to the inner member of an Rc? [duplicate]

This question already has answers here:
How do I borrow a RefCell<HashMap>, find a key, and return a reference to the result? [duplicate]
(1 answer)
Why can't I store a value and a reference to that value in the same struct?
(4 answers)
Closed 5 years ago.
My goal is to write a function that will take an Rc<RefCell<&'a mut [u8]>> as an argument and return a struct that contains a reference to the inner slice, but I can't satisfy the borrow checker. My first attempt looked like this:
pub fn mk_buf_holder<'a>(buf: Rc<RefCell<&'a mut [u8]>>) -> BufHolder<'a> {
BufHolder::<'a> { buf: buf.borrow_mut().deref_mut()}
}
But that doesn't work, of course, because the result of borrow_mut goes out of scope. My next attempt added extra members to the struct, just to keep around values that would otherwise be temporary. I thought that putting them into the struct would give them the same lifetime as buf, but the borrow checker disagrees. Here's the full example:
use std::cell::{Ref, RefCell, RefMut};
use std::ops::DerefMut;
use std::rc::Rc;

pub struct BufHolder<'a> {
    buf: &'a mut [u8],
    bufclone: Rc<RefCell<&'a mut [u8]>>,
    bufref: RefMut<'a, &'a mut [u8]>,
}

pub fn mk_buf_holder<'a>(buf: Rc<RefCell<&'a mut [u8]>>) -> BufHolder<'a> {
    let bufclone = buf.clone();
    let bufref = bufclone.borrow_mut();
    BufHolder::<'a> {
        bufclone: bufclone,
        buf: bufref.deref_mut(),
        bufref: bufref,
    }
}
The compiler still tells me that the result of borrow_mut() doesn't live long enough, even though I added it to the output structure. It's as if the compiler is copying into the output, instead of moving it. How can I fix this function?
error: `bufclone` does not live long enough
--> src/main.rs:13:18
|
13 | let bufref = bufclone.borrow_mut();
| ^^^^^^^^ does not live long enough
...
19 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the lifetime 'a as defined on the body at 11:74...
--> src/main.rs:11:75
|
11 | pub fn mk_buf_holder<'a>(buf: Rc<RefCell<&'a mut [u8]>>) -> BufHolder<'a> {
| ___________________________________________________________________________^ starting here...
12 | | let bufclone = buf.clone();
13 | | let bufref = bufclone.borrow_mut();
14 | | BufHolder::<'a> {
15 | | bufclone: bufclone,
16 | | buf: bufref.deref_mut(),
17 | | bufref: bufref,
18 | | }
19 | | }
| |_^ ...ending here
error: `bufref` does not live long enough
--> src/main.rs:16:14
|
16 | buf: bufref.deref_mut(),
| ^^^^^^ does not live long enough
...
19 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the lifetime 'a as defined on the body at 11:74...
--> src/main.rs:11:75
|
11 | pub fn mk_buf_holder<'a>(buf: Rc<RefCell<&'a mut [u8]>>) -> BufHolder<'a> {
| ___________________________________________________________________________^ starting here...
12 | | let bufclone = buf.clone();
13 | | let bufref = bufclone.borrow_mut();
14 | | BufHolder::<'a> {
15 | | bufclone: bufclone,
16 | | buf: bufref.deref_mut(),
17 | | bufref: bufref,
18 | | }
19 | | }
| |_^ ...ending here
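In line with the linked duplicates, a struct cannot store both a value and a reference derived from that value. One possible direction, as a minimal sketch: assume it is acceptable for BufHolder to keep only the Rc and to borrow the cell on each access instead of storing a &'a mut [u8]. The method name with_buf is hypothetical:

```rust
use std::cell::RefCell;
use std::rc::Rc;

pub struct BufHolder<'a> {
    // Keep the shared handle; do not store a reference derived from it.
    buf: Rc<RefCell<&'a mut [u8]>>,
}

impl<'a> BufHolder<'a> {
    /// Hypothetical accessor: borrows the cell only while `f` runs.
    /// Panics if the cell is already borrowed elsewhere.
    pub fn with_buf<R>(&self, f: impl FnOnce(&mut [u8]) -> R) -> R {
        let mut guard = self.buf.borrow_mut();
        f(&mut **guard)
    }
}

pub fn mk_buf_holder<'a>(buf: Rc<RefCell<&'a mut [u8]>>) -> BufHolder<'a> {
    BufHolder { buf }
}

fn main() {
    let mut storage = [0u8; 4];
    let shared = Rc::new(RefCell::new(&mut storage[..]));
    let holder = mk_buf_holder(Rc::clone(&shared));
    holder.with_buf(|b| b[0] = 7);
    println!("{}", holder.with_buf(|b| b[0]));
}
```

The RefMut guard lives only inside with_buf, so nothing in the struct has to outlive a local variable.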
