POLARS Dataframe innerJOIN in RUST - rust

RUST / POLARS nooby question :)
I can not get the "inner_join" to work:
use polars::prelude::*;
use std::fs::File;
use std::path::PathBuf;
use std::env;
fn main() -> std::io::Result<()> {
    let mut root = env::current_dir().unwrap();
    let file_1 = root.join("data_1.csv");
    let file_2 = root.join("data_2.csv");
    // Get data from first file (one column data: column_1)
    let file = File::open(file_1).expect("Cannot open file.");
    let first_data = CsvReader::new(file)
        .has_header(false)
        .finish()
        .unwrap();
    // WORKS !
    println!("{}", first_data);
    // Get data from second file (one column data: column_1)
    let file = File::open(file_2).expect("Cannot open file.");
    let second_data = CsvReader::new(file)
        .has_header(false)
        .finish()
        .unwrap();
    // WORKS !
    println!("{}", second_data);
    // Trying to get an INNER join
    let all_data = first_data.inner_join(second_data, "column_1", "column_1");
    println!("{}", all_data);
    Ok(())
}
BUILD OUTPUT:
error[E0277]: `&str` is not an iterator
--> src\main.rs:33:31
|
33 | let all_data = first_data.inner_join(second_data, "column_1", "column_1");
| ^^^^^^^^^^ `&str` is not an iterator; try calling `.chars()` or `.bytes()`
|
= help: the trait `Iterator` is not implemented for `&str`
= note: required because of the requirements on the impl of `IntoIterator` for `&str`
note: required by a bound in `hash_join::<impl DataFrame>::inner_join`
--> C:\Users\rnio\.cargo\registry\src\github.com-1ecc6299db9ec823\polars-core-0.23.1\src\frame\hash_join\mod.rs:645:12
|
645 | I: IntoIterator<Item = S>,
| ^^^^^^^^^^^^^^^^^^^^^^ required by this bound in `hash_join::<impl DataFrame>::inner_join`
Looking for any hint / information what I am missing ... I looked at POLARS Features ... and could not see a flag needed to do JOIN operations ... any ideas ?
Thanks in advance :)

The problem is with the columns, not with the DataFrame.
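The inner_join function takes the other DataFrame by reference plus two column selections, and each selection must implement IntoIterator (the `I: IntoIterator<Item = S>` bound shown in the error). A bare &str does not implement IntoIterator, which is why the compiler suggests calling .chars() or .bytes(). That suggestion is a red herring here: you want an iterator over column names, not over characters, so wrap each name in an array.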
You should be able to get this to work with the following:
let all_data = first_data.inner_join(&second_data, ["column_1"], ["column_1"]);
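You can see the definition of this function here: https://docs.rs/polars/latest/polars/frame/struct.DataFrame.html#method.inner_join
Note that inner_join returns a Result, so the joined DataFrame has to be unwrapped (or the error handled) before it can be printed.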
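To see why a bare string literal is rejected while an array of names is accepted, here is a minimal sketch that uses only the standard library. `collect_keys` is a hypothetical stand-in, not a Polars function: it only mimics the `I: IntoIterator<Item = S>` bound from the error message, and the `S: AsRef<str>` bound is an assumption made for illustration.

```rust
// Hypothetical helper: NOT part of Polars. It only reproduces the shape
// of the generic bound shown in the compiler error.
fn collect_keys<I, S>(columns: I) -> Vec<String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>, // assumed bound, for illustration only
{
    columns
        .into_iter()
        .map(|name| name.as_ref().to_owned())
        .collect()
}

fn main() {
    // An array of names implements IntoIterator, so this compiles.
    let keys = collect_keys(["column_1"]);
    println!("{:?}", keys);

    // A bare &str does not implement IntoIterator, so the next line
    // would be rejected with an E0277 error like the one in the question:
    // let keys = collect_keys("column_1");
}
```

The same reasoning explains why a Vec or a slice of names would also satisfy such a bound: anything that can be turned into an iterator of string-like items is accepted.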

Related

Rust: Error[E0277]: the trait bound `{integer}: SampleRange<_>` is not satisfied

I have a line of code that is in a for loop, and it's supposed to generate a random number from 0 to 2499. It is giving me problems.
let index = rand::thread_rng().gen_range(2499);
Full code for those who want to know:
fn generate_phrase () -> String {
    let mut phrase = String::new();
    let mut file = File::open("words.txt").expect("Failed to open words.txt");
    let mut contents = String::new();
    file.read_to_string(&mut contents).expect("Failed to read words.txt");
    let words: Vec<&str> = contents.split("\n").collect();
    for _ in 0..8 {
        let index = rand::thread_rng().gen_range(2499);
        phrase.push_str(words[index]);
        phrase.push(' ');
    }
    println!("Your phrase is: {:?}", phrase);
    return phrase;
}
Error message:
error[E0277]: the trait bound `{integer}: SampleRange<_>` is not satisfied
--> src/crypto/crypto.rs:115:45
|
115 | let index = rand::thread_rng().gen_range(2499);
| --------- ^^^^ the trait `SampleRange<_>` is not implemented for `{integer}`
| |
| required by a bound introduced by this call
|
note: required by a bound in `gen_range`
--> C:\Users\Administrator\.cargo\registry\src\github.com-1ecc6299db9ec823\rand-0.8.5\src\rng.rs:132:12
|
132 | R: SampleRange<T>
| ^^^^^^^^^^^^^^ required by this bound in `gen_range`
I know the problem is that the argument does not implement the required trait, but I don't know how to convert the integer into something that implements SampleRange<T>. I've looked on Stack Overflow and couldn't find an appropriate answer anywhere.
The SampleRange that it complains about can be either a Range or RangeInclusive, rather than just an upper bound (see the "implementations" section in SampleRange to see which types implement the trait). Since you want numbers from 0 to 2499 and a half-open range such as 0..2499 stops at 2498, use an inclusive range. All you need is to change that one line to look something like this:
let index = rand::thread_rng().gen_range(0..=2499);
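The difference between the two range types matters for the upper bound. The sketch below uses only the standard library (it does not call rand) to show which values each range covers. `index_range` is a hypothetical helper that derives the range from the data instead of hard-coding the word count.

```rust
use std::ops::{Range, RangeInclusive};

/// Half-open range over every valid index of a collection with `len` items.
/// Hypothetical helper for illustration; the range is empty when `len` is 0.
fn index_range(len: usize) -> Range<usize> {
    0..len
}

fn main() {
    let half_open: Range<usize> = 0..2499; // 0, 1, ..., 2498
    let inclusive: RangeInclusive<usize> = 0..=2499; // 0, 1, ..., 2499

    println!("0..2499 contains 2499: {}", half_open.contains(&2499));
    println!("0..=2499 contains 2499: {}", inclusive.contains(&2499));

    // Deriving the range from the data avoids an out-of-bounds index
    // if the word list is shorter than expected.
    let words = ["alpha", "beta", "gamma"];
    let range = index_range(words.len());
    println!("valid indices: {:?}", range);
}
```

Passing a range built from `words.len()` to gen_range would keep the index in bounds even if words.txt changes size, as long as the list is not empty.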

Why does a variable holding the result of Vec::get_mut not need to be mutable?

I have the following code:
fn main() {
    let mut vec = Vec::new();
    vec.push(String::from("Foo"));
    let mut row = vec.get_mut(0).unwrap();
    row.push('!');
    println!("{}", vec[0])
}
It prints out "Foo!", but the compiler tells me:
warning: variable does not need to be mutable
--> src/main.rs:4:9
|
4 | let mut row = vec.get_mut(0).unwrap();
| ----^^^
| |
| help: remove this `mut`
Surprisingly, removing the mut works. This raises a few questions:
Why does this work?
Why doesn't this work when I use vec.get instead of vec.get_mut, regardless of whether I use let or let mut?
Why doesn't vec work in the same way, i.e. when I use let vec = Vec::new(), why can't I call vec.push()?
vec.get_mut(0) returns an Option<&mut String>, so when you unwrap that value you have a mutable borrow of a String. Remember that the left side of a let statement is a pattern, so when your pattern is just a variable name you are essentially saying "match whatever is on the right and call it name". Thus row is bound to a &mut String, and that reference already permits mutation. The mut in let mut row would only be needed to reassign row itself, which the code never does.
Here's a much simpler and more straightforward example to illustrate the case (which you can try in the playground):
fn main() {
    let mut x = 55i32;
    dbg!(&x);
    let y = &mut x; // <-- y's type is `&mut i32`
    *y = 12;
    dbg!(&x);
}
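The distinction between a mutable binding and a mutable reference can be sketched with two small functions. Both `append_bang` and `retarget` are hypothetical names chosen for illustration. The first mirrors the code in the question. The second shows the one situation where `let mut` on a reference binding is required: reassigning the binding itself. For the other two questions, vec.get returns an Option<&String>, a shared reference that never permits mutation, and a plain let vec owns the value directly, so push (which takes &mut self) needs the binding to be declared mut.

```rust
/// Mutates the String through a `&mut` held in an immutable binding.
fn append_bang(vec: &mut Vec<String>) {
    // `row` is never reassigned, so the binding needs no `mut`;
    // the mutation goes through the `&mut String` it holds.
    let row = vec.get_mut(0).unwrap();
    row.push('!');
}

/// Here the binding itself is reassigned, so `let mut` is required.
fn retarget(a: &mut i32, b: &mut i32) {
    let mut target = a; // `mut` because `target` is pointed elsewhere below
    *target += 1;
    target = b; // reassigning the binding is what needs `let mut`
    *target += 10;
}

fn main() {
    let mut vec = vec![String::from("Foo")];
    append_bang(&mut vec);
    println!("{}", vec[0]);

    let (mut x, mut y) = (1, 2);
    retarget(&mut x, &mut y);
    println!("{} {}", x, y);
}
```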

Why can't I collect the Lines iterator into a vector of Strings?

I'm trying to read the lines of a text file into a vector of Strings so I can continually loop over them and write each line to a channel for testing, but the compiler complains about collect:
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::path::Path;
fn main() {
    let file = File::open(Path::new("file")).unwrap();
    let reader = BufReader::new(&file);
    let _: Vec<String> = reader.lines().collect().unwrap();
}
The compiler complains:
error[E0282]: type annotations needed
--> src/main.rs:9:30
|
9 | let lines: Vec<String> = reader.lines().collect().unwrap();
| ^^^^^^^^^^^^^^^^^^^^^^^^ cannot infer type for `B`
|
= note: type must be known at this point
Without the .unwrap(), compiler says:
error[E0277]: a collection of type `std::vec::Vec<std::string::String>` cannot be built from an iterator over elements of type `std::result::Result<std::string::String, std::io::Error>`
--> src/main.rs:9:45
|
9 | let lines: Vec<String> = reader.lines().collect();
| ^^^^^^^ a collection of type `std::vec::Vec<std::string::String>` cannot be built from `std::iter::Iterator<Item=std::result::Result<std::string::String, std::io::Error>>`
|
= help: the trait `std::iter::FromIterator<std::result::Result<std::string::String, std::io::Error>>` is not implemented for `std::vec::Vec<std::string::String>`
How do I tell Rust the correct type?
Since you want to collect straight into a Vec<String> while the Lines iterator is over Result<String, std::io::Error>, you need to help the type inference a little bit:
let lines: Vec<String> = reader.lines().collect::<Result<_, _>>().unwrap();
or even just:
let lines: Vec<_> = reader.lines().collect::<Result<_, _>>().unwrap();
This way the compiler knows that there is an intermediate step with a Result<Vec<String>, io::Error>. I think this case could be improved in the future, but for now the type inference is not able to deduce this.
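Another way to give the compiler the intermediate type is to put it in a function signature, so that no turbofish is needed. The sketch below assumes a hypothetical helper named `read_lines` and uses an in-memory `Cursor` in place of the file so that it runs without any file on disk.

```rust
use std::io::{self, BufRead, Cursor};

/// Collects every line, stopping at the first I/O error.
fn read_lines<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    // The return type tells `collect` to build a Result<Vec<String>, io::Error>.
    reader.lines().collect()
}

fn main() {
    // Cursor stands in for the file so the sketch runs without one.
    let reader = Cursor::new("first\nsecond\nthird\n");
    let lines = read_lines(reader).expect("reading from memory should not fail");
    println!("{:?}", lines);
}
```

With a real file, the same function could be called as `read_lines(BufReader::new(file))`, and the caller decides whether to unwrap or propagate the error.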

How do I pass a vector into a function?

The idea is to send a set of characters of a vector and let the function display the current correct guesses.
Here is my main:
fn main() {
    let mut guessedLetters = vec![];
    displayWord(guessedLetters);
}
And here is the function:
fn displayWord(correctGuess: Vec<char>) {
    let mut currentWord = String::new();
    for x in 0..5 {
        currentWord.push(correctGuess[x]);
    }
    println!("Current guesses: {}", currentWord);
}
I don't know what I'm supposed to write inside the parameters of displayWord.
There are a couple of things wrong with your code.
The first error is pretty straightforward:
--> src/main.rs:38:25
|
38 | displayWord(guessed_Letters);
| ^^^^^^^^^^^^^^^ expected char, found enum `std::option::Option`
|
= note: expected type `std::vec::Vec<char>`
found type `std::vec::Vec<std::option::Option<char>>`
The function you wrote is expecting a vector of characters, but you're passing it a vector of Option<char>. This is happening here:
guessed_Letters.push(line.chars().nth(0));
According to the documentation, the nth method returns an Option. The quick fix here is to unwrap the Option to get the underlying value:
guessed_Letters.push(line.chars().nth(0).unwrap());
Your next error is:
error[E0382]: use of moved value: `guessed_Letters`
--> src/main.rs:38:25
|
38 | displayWord(guessed_Letters);
| ^^^^^^^^^^^^^^^ value moved here in previous iteration of loop
|
= note: move occurs because `guessed_Letters` has type `std::vec::Vec<char>`, which does not implement the `Copy` trait
The call transfers ownership of the vector on the first iteration of the loop, and the compiler is telling you that subsequent iterations would violate Rust's ownership rules.
The solution here is to pass the vector by reference instead:
displayWord(&guessed_Letters);
...and your function should also accept a reference:
fn displayWord(correctGuess: &Vec<char>) {
    let mut currentWord = String::new();
    for x in 0..5 {
        currentWord.push(correctGuess[x]);
    }
    println!("Current guesses: {}", currentWord);
}
This can be shortened to use a slice and still work:
fn displayWord(correctGuess: &[char]) {
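Putting the pieces together, here is a minimal sketch of the slice-based version. The snake_case name `current_word` and the choice to return the String instead of printing it are assumptions made so the result can be checked. It also replaces the fixed `0..5` indexing with `take(5)`, because indexing would panic whenever fewer than five letters have been guessed.

```rust
/// Builds the display string from at most the first five guesses.
fn current_word(correct_guess: &[char]) -> String {
    // `take(5)` stops early instead of panicking when fewer than
    // five letters have been guessed.
    correct_guess.iter().take(5).collect()
}

fn main() {
    let guessed_letters = vec!['r', 'u', 's', 't'];
    // `&Vec<char>` coerces to `&[char]` automatically.
    println!("Current guesses: {}", current_word(&guessed_letters));
    // The vector was only borrowed, so it can still be used afterwards.
    println!("Letters so far: {}", guessed_letters.len());
}
```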

What is a clean way to convert a Result into an Option?

Before updating to a more recent Rust version the following used to work:
fn example(val: &[&str]) {
    let parsed_value: Vec<usize> = val
        .iter()
        .filter_map(|e| e.parse::<usize>())
        .collect();
}
However, now the parse method returns a Result type instead of an Option and I get the error:
error[E0308]: mismatched types
--> src/lib.rs:4:25
|
4 | .filter_map(|e| e.parse::<usize>())
| ^^^^^^^^^^^^^^^^^^ expected enum `std::option::Option`, found enum `std::result::Result`
|
= note: expected type `std::option::Option<_>`
found type `std::result::Result<usize, std::num::ParseIntError>`
I could create an Option through a conditional, but is there a better / cleaner way?
Use Result::ok. Types added for clarity:
let res: Result<u8, ()> = Ok(42);
let opt: Option<u8> = res.ok();
println!("{:?}", opt);
For symmetry's sake, there's also Option::ok_or and Option::ok_or_else to go from an Option to a Result.
In your case, you have an iterator.
If you'd like to ignore failures, use Iterator::flat_map. Since Result (and Option) implement IntoIterator, this works:
let parsed_value: Vec<usize> = val
    .iter()
    .flat_map(|e| e.parse())
    .collect();
If you'd like to stop on the first failure, you can collect into one big Result. This is less obvious, but you can check out the implementors of FromIterator for the full list of collect-able items.
let parsed_value: Result<Vec<usize>, _> = val
    .iter()
    .map(|e| e.parse())
    .collect();
Of course, you can then convert the one big Result into an Option, as the first example shows.
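The two strategies can be compared side by side in a small runnable sketch. `parse_lenient` and `parse_strict` are hypothetical names. The lenient version keeps the original `filter_map` call from the question and adapts it with `Result::ok`, which is an alternative to the `flat_map` form shown earlier.

```rust
use std::num::ParseIntError;

/// Keeps the values that parse and silently drops the rest.
fn parse_lenient(val: &[&str]) -> Vec<usize> {
    val.iter().filter_map(|e| e.parse::<usize>().ok()).collect()
}

/// Returns the first parse error, or every value if all of them parse.
fn parse_strict(val: &[&str]) -> Result<Vec<usize>, ParseIntError> {
    val.iter().map(|e| e.parse::<usize>()).collect()
}

fn main() {
    let input = ["1", "x", "3"];
    println!("{:?}", parse_lenient(&input));
    // `.ok()` turns the one big Result into an Option.
    println!("{:?}", parse_strict(&input).ok());
    println!("{:?}", parse_strict(&["1", "2"]).ok());
}
```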
