How to get a Vec from polars Series or ChunkedArray? - rust

In Rust Polars, how to cast a Series or ChunkedArray to a Vec?

Downcast the Series to its typed ChunkedArray (here with i32()), then collect that array's iterator into a Vec:
use polars::prelude::*;
fn main() -> Result<()> {
    let s = Series::new("a", 0..10i32);
    let as_vec: Vec<Option<i32>> = s.i32()?.into_iter().collect();
    // if we are certain we don't have missing values
    let as_vec: Vec<i32> = s.i32()?.into_no_null_iter().collect();
    Ok(())
}
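If you start from the Vec<Option<i32>> form, plain standard-library iterators are enough to get a Vec<i32> out of it. The following is a minimal sketch that does not touch polars at all; the helper names all_or_none and fill_missing are made up for illustration.

```rust
/// Returns Some(values) only if no element is missing.
/// (Hypothetical helper, not part of polars.)
fn all_or_none(values: &[Option<i32>]) -> Option<Vec<i32>> {
    // Collecting an iterator of Option<T> into Option<Vec<T>>
    // yields None as soon as one element is None.
    values.iter().copied().collect()
}

/// Replaces every missing element with `default`.
/// (Hypothetical helper, not part of polars.)
fn fill_missing(values: &[Option<i32>], default: i32) -> Vec<i32> {
    values.iter().map(|v| v.unwrap_or(default)).collect()
}

fn main() {
    let with_nulls = vec![Some(1), None, Some(3)];
    println!("{:?}", all_or_none(&with_nulls));
    println!("{:?}", fill_missing(&with_nulls, 0));
}
```

Which of the two fits depends on whether a missing value should abort the conversion or be replaced.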

Related

Rust How to modify a polars DataFrame in a function, so that the caller see the changes?

I am lost with mutable references. I am trying to pass a DataFrame into a function, change it there, and see the changes after the function call completes.
I get this error:
cannot borrow as mutable
Here is a code sample:
use polars::prelude::*;
use std::ops::DerefMut;
fn main() {
    let mut days = df!(
        "date_string" => &["1900-01-01", "1900-01-02", "1900-01-03", "1900-01-04", "1900-01-05",
            "1900-01-06", "1900-01-07", "1900-01-09", "1900-01-10"])
    .unwrap();
    change(&mut days);
    println!("{:?}", days);
}
fn change(days: &mut DataFrame) {
    // This is the line that fails with "cannot borrow as mutable"
    days.column("date_string").unwrap().rename("DATE-STRING");
}
The signature of column is
fn column(&self, name: &str) -> Result<&Series, PolarsError>
It takes &self and returns a shared reference to the column, while Series::rename needs a mutable one, hence the borrow error. DataFrame has its own rename method that you should use instead:
use polars::df;
use polars::prelude::*;
fn main() {
    let mut days = df!(
        "date_string" => &["1900-01-01", "1900-01-02", "1900-01-03", "1900-01-04", "1900-01-05",
            "1900-01-06", "1900-01-07", "1900-01-09", "1900-01-10"])
    .unwrap();
    change(&mut days).unwrap();
    assert_eq!(days.get_column_names(), &["DATE-STRING"]);
}
fn change(days: &mut DataFrame) -> Result<&mut DataFrame> {
    days.rename("date_string", "DATE-STRING")
}
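The borrow rule behind this can be reproduced without polars. The sketch below uses hypothetical stand-in types Frame and Column (not the real DataFrame and Series) purely to show why a getter taking &self cannot be used for renaming, while a method taking &mut self can.

```rust
// Hypothetical stand-ins for Series and DataFrame, only to illustrate the borrow rules.
struct Column {
    name: String,
}

struct Frame {
    columns: Vec<Column>,
}

impl Frame {
    /// Shared access: the returned reference cannot be used to mutate the column.
    fn column(&self, name: &str) -> Option<&Column> {
        self.columns.iter().find(|c| c.name == name)
    }

    /// Mutation goes through `&mut self`, so the caller sees the change.
    fn rename(&mut self, old: &str, new: &str) -> Result<&mut Frame, String> {
        let col = self
            .columns
            .iter_mut()
            .find(|c| c.name == old)
            .ok_or_else(|| format!("column not found: {}", old))?;
        col.name = new.to_string();
        Ok(self)
    }
}

/// Takes the frame by mutable reference, like `change` in the answer above.
fn change(frame: &mut Frame) -> Result<(), String> {
    frame.rename("date_string", "DATE-STRING")?;
    Ok(())
}

fn main() {
    let mut days = Frame {
        columns: vec![Column { name: "date_string".to_string() }],
    };
    change(&mut days).unwrap();
    // The rename done inside `change` is visible to the caller.
    assert!(days.column("DATE-STRING").is_some());
}
```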

Using ndarray to create a time series in rust

How would I create a time series Array from a CSV file using ndarray?
I have this CSV:
date,value
1959-07-02,0.2930
1959-07-06,0.2910
1959-07-07,0.2820
1959-07-08,0.2846
1959-07-09,0.2760
1959-07-10,0.2757
I'd like to plot it using plotly-rs with ndarray support. I deserialized the CSV successfully, but now I want to know how I can create two Array objects: one with the dates as NaiveDate (or String, as I'm not sure that plotly-rs supports NaiveDate natively), and another with the values as f64. Below is my deserialization code:
#[derive(Deserialize)]
struct Record {
    #[serde(deserialize_with = "naive_date_time_from_str")]
    date: NaiveDate,
    value: f64
}
fn naive_date_time_from_str<'de, D>(deserializer: D) -> Result<NaiveDate, D::Error>
where
    D: Deserializer<'de>,
{
    let s: String = Deserialize::deserialize(deserializer)?;
    NaiveDate::parse_from_str(&s, "%Y-%m-%d").map_err(de::Error::custom)
}
And I can iterate through the CSV like this:
fn main() -> Result<(), Box<dyn Error>> {
    let mut reader = ReaderBuilder::new()
        .has_headers(true)
        .delimiter(b',')
        .from_path("./data/timeseries.csv")?;
    for record in reader.deserialize::<Record>() {
        let record: Record = record?;
        println!(
            "date {}, value = {}",
            record.date.format("%Y-%m-%d").to_string(),
            record.value
        );
    }
    Ok(())
}
But now I'm stuck at creating the two ndarray Array objects. Any hints?
EDIT: A somewhat similar approach would be done in this topic (but without using ndarray): How to push data from a csv::StringRecord to each column vector in a struct?
You can read the CSV data and plot a chart directly, without an additional ndarray step:
use csv::Error;
use plotly::{Plot, Scatter};
fn main() -> Result<(), Error> {
    let csv = "date,value
1959-07-02,0.2930
1959-07-06,0.2910
1959-07-07,0.2820
1959-07-08,0.2846
1959-07-09,0.2760
1959-07-10,0.2757";
    let mut reader = csv::Reader::from_reader(csv.as_bytes());
    let mut date = vec![];
    let mut data = vec![];
    for record in reader.records() {
        let record = record?;
        date.push(record[0].to_string());
        data.push(record[1].to_string());
    }
    let trace = Scatter::new(date, data);
    let mut plot = Plot::new();
    plot.add_trace(trace);
    plot.show();
    Ok(())
}
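If the values should end up as f64 rather than strings, the same two-column split can be sketched with only the standard library. This is an illustration under simplifying assumptions: split_columns is a made-up helper, and its hand-rolled parsing ignores quoting and escaping, which the csv crate handles properly. As far as I know, each resulting Vec could then be turned into an ndarray array with Array1::from_vec if one is really needed.

```rust
/// Splits "date,value" CSV text into a date column and a numeric column.
/// Hand-rolled parsing for illustration only; it ignores quoting and escaping.
fn split_columns(csv: &str) -> Result<(Vec<String>, Vec<f64>), String> {
    let mut dates = Vec::new();
    let mut values = Vec::new();
    // skip(1) drops the header row; blank lines are ignored.
    for line in csv.lines().skip(1).filter(|l| !l.trim().is_empty()) {
        let (date, value) = line
            .split_once(',')
            .ok_or_else(|| format!("malformed row: {}", line))?;
        let value: f64 = value
            .trim()
            .parse()
            .map_err(|e| format!("bad value in row {:?}: {}", line, e))?;
        dates.push(date.trim().to_string());
        values.push(value);
    }
    Ok((dates, values))
}

fn main() {
    let csv = "date,value\n1959-07-02,0.2930\n1959-07-06,0.2910";
    let (dates, values) = split_columns(csv).unwrap();
    println!("{:?} {:?}", dates, values);
}
```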

Sort then dedup a rust vector of integer in one line

I want to sort then dedup a vector of i32 in rust:
fn main() {
    let mut vec = vec![5,5,3,3,4,4,2,1,2,1];
    vec.sort().collect().dedup();
    println!("{:?}", vec);
}
This code does not work, but if the dedup part is done this way, it's fine:
vec.sort();
vec.dedup();
How can I sort and dedup in one line in my example?
Vec::sort sorts in place and returns (), so there is nothing to chain on. With itertools, you can do:
use itertools::Itertools;
fn main() {
    let mut vec = vec![5,5,3,3,4,4,2,1,2,1];
    vec = vec.into_iter().sorted().dedup().collect();
    println!("{:?}", vec);
}
It really depends on why you think you need a one-liner, though. If it's actually because you need a single expression, then you can use a block:
vec = {
    vec.sort();
    vec.dedup();
    vec
};
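Another single-expression option that needs no extra crate is to go through a BTreeSet, which keeps its elements sorted and unique. This is only a sketch of that alternative, not something the answers above propose; sorted_unique is a made-up name, and the detour through a tree costs extra allocations compared with sort plus dedup in place.

```rust
use std::collections::BTreeSet;

/// Sorts and deduplicates in a single expression by going through a BTreeSet.
fn sorted_unique(values: Vec<i32>) -> Vec<i32> {
    values.into_iter().collect::<BTreeSet<_>>().into_iter().collect()
}

fn main() {
    let vec = vec![5, 5, 3, 3, 4, 4, 2, 1, 2, 1];
    let vec = sorted_unique(vec);
    println!("{:?}", vec); // prints [1, 2, 3, 4, 5]
}
```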

How do I correctly pass this vector of file lines to my function in Rust?

I'm trying to read a file into a vector, then print out a random line from that vector.
What am I doing wrong?
I'm asking here because I know I'm making a big conceptual mistake, but I'm having trouble identifying exactly where it is.
I know the error -
error[E0308]: mismatched types
26 | processor(&lines)
| ^^^^^^ expected &str, found struct std::string::String
And I see that there's a mismatch - but I don't know how to give the right type, or refactor the code for that (very short) function.
My code is below:
use std::{
    fs::File,
    io::{prelude::*, BufReader},
    path::Path,
};
fn lines_from_file(filename: impl AsRef<Path>) -> Vec<String> {
    let file = File::open(filename).expect("no such file");
    let buf = BufReader::new(file);
    buf.lines()
        .map(|l| l.expect("Could not parse line"))
        .collect()
}
fn processor(vectr: &Vec<&str>) -> () {
    let vec = vectr;
    let index = (rand::random::<f32>() * vec.len() as f32).floor() as usize;
    println!("{}", vectr[index]);
}
fn main() {
    let lines = lines_from_file("./example.txt");
    for line in lines {
        println!("{:?}", line);
    }
    processor(&lines);
}
When you call processor, you pass a &Vec<String> (lines_from_file returns a Vec<String>), but processor expects a &Vec<&str>. You can change processor's signature to match what you actually have:
fn processor(vectr: &Vec<String>) -> () {
    let vec = vectr;
    let index = (rand::random::<f32>() * vec.len() as f32).floor() as usize;
    println!("{}", vectr[index]);
}
The main function:
fn main() {
    let lines = lines_from_file("./example.txt");
    for line in &lines { // &lines to avoid moving the variable
        println!("{:?}", line);
    }
    processor(&lines);
}
More generally, a String is not the same type as a string slice &str, so Vec<String> is not the same as Vec<&str>. I'd recommend checking the Rust book: https://doc.rust-lang.org/nightly/book/ch04-03-slices.html?highlight=String#string-slices
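A function can also be written so that it accepts both kinds of vector. The sketch below is one possible shape, not the answer's code: pick_line is a made-up name, and it takes the index as a parameter (instead of calling rand inside) so that it stays standard-library only; the caller could pass in a random index.

```rust
/// Returns the line at `index`, or None if the index is out of range.
/// Accepting `&[S]` with `S: AsRef<str>` works for Vec<String> and Vec<&str> alike.
fn pick_line<S: AsRef<str>>(lines: &[S], index: usize) -> Option<&str> {
    lines.get(index).map(|s| s.as_ref())
}

fn main() {
    let owned: Vec<String> = vec!["first".to_string(), "second".to_string()];
    let borrowed: Vec<&str> = vec!["first", "second"];
    // Both calls compile: &Vec<T> coerces to &[T].
    println!("{:?}", pick_line(&owned, 1));
    println!("{:?}", pick_line(&borrowed, 0));
}
```

Returning an Option instead of indexing directly also avoids a panic when the file is empty.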

String join on strings in Vec in reverse order without a `collect`

I'm trying to join strings in a vector into a single string, in reverse from their order in the vector. The following works:
let v = vec!["a".to_string(), "b".to_string(), "c".to_string()];
v.iter().rev().map(|s| s.clone()).collect::<Vec<String>>().connect(".")
However, this ends up creating a temporary vector that I don't actually need. Is it possible to do this without a collect? I see that connect is a StrVector method. Is there nothing for raw iterators?
I believe this is the shortest you can get:
fn main() {
    let v = vec!["a".to_string(), "b".to_string(), "c".to_string()];
    let mut r = v.iter()
        .rev()
        .fold(String::new(), |r, c| r + c.as_str() + ".");
    r.pop();
    println!("{}", r);
}
The addition operation on String takes its left operand by value and appends the second operand to it in place, which is very nice: it reuses the accumulator's buffer instead of allocating a new String at every step (the buffer only grows when it runs out of capacity). You don't even need to clone() the contained strings.
I think, however, that the lack of concat()/connect() methods on iterators is a serious drawback. It bit me a lot too.
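The same idea can be written as an explicit loop that reserves the final length up front and never adds a trailing separator, so there is nothing to pop afterwards. This is a sketch using only the standard library; join_reversed is a made-up helper name.

```rust
/// Joins the items in reverse order with `sep`, without building a temporary Vec.
fn join_reversed(items: &[String], sep: &str) -> String {
    // Compute the exact final length so the buffer is allocated once.
    let total: usize = items.iter().map(|s| s.len()).sum::<usize>()
        + sep.len() * items.len().saturating_sub(1);
    let mut out = String::with_capacity(total);
    for (i, item) in items.iter().rev().enumerate() {
        // Put the separator before every item except the first one emitted.
        if i > 0 {
            out.push_str(sep);
        }
        out.push_str(item);
    }
    out
}

fn main() {
    let v = vec!["a".to_string(), "b".to_string(), "c".to_string()];
    println!("{}", join_reversed(&v, ".")); // prints c.b.a
}
```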
I don't know if they've heard our Stack Overflow prayers or what, but the itertools crate happens to have just the method you need - join.
With it, your example might be laid out as follows:
use itertools::Itertools;
let v = ["a", "b", "c"];
let connected = v.iter().rev().join(".");
Here's an iterator extension trait that I whipped up, just for you!
pub trait InterleaveExt: Iterator + Sized {
    fn interleave(self, value: Self::Item) -> Interleave<Self> {
        Interleave {
            iter: self.peekable(),
            value,
            me_next: false,
        }
    }
}
impl<I: Iterator> InterleaveExt for I {}
pub struct Interleave<I>
where
    I: Iterator,
{
    iter: std::iter::Peekable<I>,
    value: I::Item,
    me_next: bool,
}
impl<I> Iterator for Interleave<I>
where
    I: Iterator,
    I::Item: Clone,
{
    type Item = I::Item;
    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        // Don't return a value if there's no next item
        if self.iter.peek().is_none() {
            return None;
        }
        let next = if self.me_next {
            Some(self.value.clone())
        } else {
            self.iter.next()
        };
        self.me_next = !self.me_next;
        next
    }
}
It can be called like so:
fn main() {
    let a = &["a", "b", "c"];
    let s: String = a.iter().cloned().rev().interleave(".").collect();
    println!("{}", s);
    let v = vec!["a".to_string(), "b".to_string(), "c".to_string()];
    let s: String = v.iter().map(|s| s.as_str()).rev().interleave(".").collect();
    println!("{}", s);
}
I've since learned that this iterator adapter already exists in itertools under the name intersperse, so go use that instead!
Cheating answer
You never said you needed the original vector after this, so we can reverse it in place and just use join...
let mut v = vec!["a".to_string(), "b".to_string(), "c".to_string()];
v.reverse();
println!("{}", v.join("."));
