Do nothing and throw an error if condition is not met - rust

At the moment I'm calling the panic! macro every time the extension check fails. That is not how it should behave: I want it to return an error and/or skip the file and do nothing. Is there a way to return an error instead of panicking, and/or skip the file?
impl FileMetaData {
    fn new(path: &str) -> FileMetaData {
        FileMetaData {
            name: FileMetaData::get_file_name(&path).to_string(),
            directory: FileMetaData::is_directory(&path),
            path: path.to_string(),
        }
    }
    fn get_file_name(path: &str) -> &str {
        let file_path = Path::new(path);
        let file_name_os_str = file_path.file_stem().unwrap();
        if !FileMetaData::is_directory(path) {
            if file_path.extension().unwrap().to_str().unwrap() != "heic" {
                // TO DO: Change to Err or skip if file not supported
                // TO DO: Possible change the whole flow of converting the image
                panic!("File format not supported");
            }
        }
        return file_name_os_str.to_str().unwrap()
    }
    fn is_directory(path: &str) -> bool {
        Path::new(&path).is_dir()
    }
}
I have two use cases for the FileMetaData struct, since I accept two kinds of path: a file and a directory.
For a file, it's easier, since I can just return an error and exit.
But for a directory, I need it not to exit when an unsupported format is detected.
// file
let file = FileMetaData::new(&file_path);
// directory
let entries = fs::read_dir(&path)?
    .map(|res| res.map(|e| e.path()))
    .collect::<Result<Vec<_>, io::Error>>()?;
for entry in entries {
    let file_path = entry.to_str().unwrap();
    // this should just print an error and continue with the
    // other entries in the list
    let file_metadata = FileMetaData::new(&file_path);
}

You can use continue to skip to the next iteration of a loop.
Method 1: Using let-else
for entry in entries {
    let Ok(data) = might_fail() else {
        eprintln!("An error occurred!");
        continue;
    };
}
Downside: the Err value is not available inside the else block.
Method 2: Using match
for entry in entries {
    let data = match might_fail() {
        Ok(data) => data,
        Err(e) => {
            eprintln!("An error occurred: {e}");
            continue;
        }
    };
}
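Applied to the code in the question, one possible shape is for FileMetaData::new to return a Result instead of panicking, so that the caller decides whether to exit (single file) or skip (directory listing). The following is only a sketch under that assumption: the String error type and the collect_supported helper are illustrative names that do not appear in the original code.

```rust
use std::path::Path;

#[derive(Debug)]
struct FileMetaData {
    name: String,
    directory: bool,
    path: String,
}

impl FileMetaData {
    // Assumption: a plain String is used as the error type to keep the sketch short.
    fn new(path: &str) -> Result<FileMetaData, String> {
        let file_path = Path::new(path);
        let directory = file_path.is_dir();

        // Only regular files are checked for the "heic" extension.
        if !directory {
            let extension = file_path.extension().and_then(|e| e.to_str());
            if extension != Some("heic") {
                return Err(format!("File format not supported: {}", path));
            }
        }

        let name = file_path
            .file_stem()
            .and_then(|s| s.to_str())
            .ok_or_else(|| format!("No usable file name in path: {}", path))?;

        Ok(FileMetaData {
            name: name.to_string(),
            directory,
            path: path.to_string(),
        })
    }
}

// Hypothetical helper: keeps the supported entries and skips the rest.
fn collect_supported(paths: &[&str]) -> Vec<FileMetaData> {
    let mut supported = Vec::new();
    for path in paths {
        let metadata = match FileMetaData::new(path) {
            Ok(metadata) => metadata,
            Err(e) => {
                // Print the error and move on to the next entry.
                eprintln!("Skipping entry: {}", e);
                continue;
            }
        };
        supported.push(metadata);
    }
    supported
}
```

For the single-file case, the same constructor can be called once and its Err propagated with ? or reported before exiting.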

Related

How do I loop with a function that doesn't take a reference?

I want to know whether it's possible to call, inside a loop {}, a function that takes a non-reference parameter without copying the argument.
I am using the posix_mq crate, and I want to open an existing queue; if it does not exist, wait 1 second and try to open it again.
Here is my code:
pub fn open(&mut self) -> Result<(), posix_mq::error::Error> {
    let mut attempt = self.tools_.get_max_attempt(); //30
    let queue_name: Name = Name::new(&self.queue_name_)?;
    loop {
        match Queue::open(queue_name) {
            Ok(q) => {
                self.my_queue_ = Some(q);
                Ok::<(),posix_mq::error::Error>(());
            }
            Err(e) => match e {
                posix_mq::error::Error::QueueNotFound() => {
                    let waiting_print = "Waiting for creation of the queue.".to_string();
                    self.tools_.update_printing_elements(waiting_print, false);
                    attempt -= 1;
                    if attempt <= 1 {
                        return Err(e);
                    }
                    thread::sleep(time::Duration::from_secs(1));
                },
                _ => {Err::<(), posix_mq::error::Error>(e);},
            }
        }
    }
}
I want Queue::open() to borrow queue_name instead of taking ownership of it, without creating a copy.

How to call a function in the Err arm

I wrote this code and it works. But I have a custom class for outputting logs to stdout. I want to call my logging function when I get an error, instead of using the panic! macro.
fn create_logfile() -> File {
    let timestamp = Log::format_date(chrono::Local::now());
    let filename = format!("{}.log", timestamp);
    let logfile = match File::create(filename) {
        Ok(file) => file,
        Err(error) => {
            panic!("There was a problem creating the file: {:?}", error)
        }
    };
    logfile
}
For example, I want something like this:
let logfile = match File::create(filename) {
    Ok(file) => file,
    Err(e) => {
        Log::error("Log file creation failed, reason: {}", e);
        process::exit(1)
    }
};
But the compiler says:
[E0308] `match` arms have incompatible types.
[Note] expected struct `File`, found `()`
How can I solve this problem?
If I put the error data to stderr will it help?
Your revised example with std::process::exit() works for me:
use chrono; // 0.4.23
use std::fs::File;
use log; // 0.4.17

struct Log { }

impl Log {
    fn format_date(date: chrono::DateTime<chrono::offset::Local>) -> i64 {
        return 0;
    }
}

fn old_create_logfile() -> File {
    let timestamp = Log::format_date(chrono::Local::now());
    let filename = format!("{}.log", timestamp);
    let logfile = match File::create(filename) {
        Ok(file) => file,
        Err(error) => {
            panic!("There was a problem creating the file: {:?}", error)
        }
    };
    logfile
}

fn new_create_logfile() -> File {
    let timestamp = Log::format_date(chrono::Local::now());
    let filename = format!("{}.log", timestamp);
    let logfile = match File::create(filename) {
        Ok(file) => file,
        Err(e) => {
            // Instead of using `Log::error`, we'll use `log::error!` for show.
            log::error!("Log file creation failed, reason: {}", e);
            // This compiles.
            std::process::exit(1)
        }
    };
    logfile
}

fn main() {
    new_create_logfile();
}
Normally, all match arms must evaluate to the same type. Here the Ok arm yields std::fs::File, so the Err arm can "escape" that requirement by having type ! (pronounced "never").
Since control never returns from std::process::exit(), its return type is !, which coerces to any other type, so the match passes the type-checking stage.
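The same rule applies to any diverging expression, not only std::process::exit(). As a small illustration (the function names first_even and parse_or_abort are made up for this sketch), continue and panic! also have type ! and can therefore sit in a match arm next to an arm that yields a value:

```rust
/// Returns the first even number in `values`, ignoring entries that do not parse.
fn first_even(values: &[&str]) -> Option<i32> {
    for raw in values {
        let n: i32 = match raw.parse() {
            Ok(n) => n,
            // `continue` has type `!`, so this arm is compatible with the `i32` arm.
            Err(_) => continue,
        };
        if n % 2 == 0 {
            return Some(n);
        }
    }
    None
}

/// Parses a port number, panicking on invalid input.
fn parse_or_abort(raw: &str) -> u16 {
    match raw.parse::<u16>() {
        Ok(port) => port,
        // `panic!` also diverges, so no `u16` has to be produced here.
        Err(e) => panic!("invalid port {:?}: {}", raw, e),
    }
}
```

If the diverging call is followed by a semicolon and is not the last expression of the arm's block, the block evaluates to () instead, which is one way to end up with the E0308 error quoted in the question.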

Rust with Datafusion - Trying to Write DataFrame to Json

*Repo w/ WIP code: https://github.com/jmelm93/rust-datafusion-csv-processing
Started programming with Rust 2 days ago, and have been trying to resolve this since ~3 hours into trying out Rust...
Any help would be appreciated.
My goal is to write a DataFrame from DataFusion to JSON (which will eventually be used to respond to HTTP requests in an API with the JSON string).
The DataFrame turns into a "datafusion::arrow::record_batch::RecordBatch" when you collect the data, and this data type is what I'm having trouble converting.
I've tried:
Using json::writer::record_batches_to_json_rows from Arrow, but it won't let me due to "struct datafusion::arrow::record_batch::RecordBatch and struct arrow::record_batch::RecordBatch have similar names, but are actually distinct types". I haven't been able to convert between the types to avoid this.
Turning the RecordBatch into a Vec and pulling out the headers and the values individually. I was able to get the headers out, but haven't had success with the values.
let mut header = Vec::new();
// let mut rows = Vec::new();
for record_batch in data_vec {
// get data
println!("record_batch.columns: : {:?}", record_batch.columns());
for col in record_batch.columns() {
for row in 0..col.len() {
// println!("Cow: {:?}", col);
// println!("Row: {:?}", row);
// let value = col.as_any().downcast_ref::<StringArray>().unwrap().value(row);
// rows.push(value);
}
}
// get headers
for field in record_batch.schema().fields() {
header.push(field.name().to_string());
}
};
Anyone know how to accomplish this?
The full script is below:
// datafusion examples: https://github.com/apache/arrow-datafusion/tree/master/datafusion-examples/examples
// datafusion docs: https://arrow.apache.org/datafusion/
use datafusion::prelude::*;
use datafusion::arrow::datatypes::{Schema};
use arrow::json;
// use serde::{ Deserialize };
use serde_json::to_string;
use std::sync::Arc;
use std::str;
use std::fs;
use std::ops::Deref;
type DFResult = Result<Arc<DataFrame>, datafusion::error::DataFusionError>;
struct FinalObject {
schema: Schema,
// columns: Vec<Column>,
num_rows: usize,
num_columns: usize,
}
// to allow debug logging for FinalObject
impl std::fmt::Debug for FinalObject {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
// write!(f, "FinalObject {{ schema: {:?}, columns: {:?}, num_rows: {:?}, num_columns: {:?} }}",
write!(f, "FinalObject {{ schema: {:?}, num_rows: {:?}, num_columns: {:?} }}",
// self.schema, self.columns, self.num_columns, self.num_rows)
self.schema, self.num_columns, self.num_rows)
}
}
fn create_or_delete_csv_file(path: String, content: Option<String>, operation: &str) {
match operation {
"create" => {
match content {
Some(c) => fs::write(path, c.as_bytes()).expect("Problem with writing file!"),
None => println!("The content is None, no file will be created"),
}
}
"delete" => {
// Delete the csv file
fs::remove_file(path).expect("Problem with deleting file!");
}
_ => println!("Invalid operation"),
}
}
async fn read_csv_file_with_inferred_schema(file_name_string: String) -> DFResult {
// create string csv data
let csv_data_string = "heading,value\nbasic,1\ncsv,2\nhere,3".to_string();
// Create a temporary file
create_or_delete_csv_file(file_name_string.clone(), Some(csv_data_string), "create");
// Create a session context
let ctx = SessionContext::new();
// Register a lazy DataFrame using the context
let df = ctx.read_csv(file_name_string.clone(), CsvReadOptions::default()).await.expect("An error occurred while reading the CSV string");
// return the dataframe
Ok(Arc::new(df))
}
#[tokio::main]
async fn main() {
let file_name_string = "temp_file.csv".to_string();
let arc_csv_df = read_csv_file_with_inferred_schema(file_name_string.clone()).await.expect("An error occurred while reading the CSV string (funct: read_csv_file_with_inferred_schema)");
// have to use ".clone()" each time I want to use this ref
let deref_df = arc_csv_df.deref();
// print to console
deref_df.clone().show().await.expect("An error occurred while showing the CSV DataFrame");
// collect to vec
let record_batches = deref_df.clone().collect().await.expect("An error occurred while collecting the CSV DataFrame");
// println!("Data: {:?}", data);
// record_batches == <Vec<RecordBatch>>. Convert to RecordBatch
let record_batch = record_batches[0].clone();
// let json_string = to_string(&record_batch).unwrap();
// let mut writer = datafusion::json::writer::RecordBatchJsonWriter::new(vec![]);
// writer.write(&record_batch).unwrap();
// let json_rows = writer.finish();
let json_rows = json::writer::record_batches_to_json_rows(&[record_batch]);
println!("JSON: {:?}", json_rows);
// get final values from recordbatch
// https://docs.rs/arrow/latest/arrow/record_batch/struct.RecordBatch.html
// https://users.rust-lang.org/t/how-to-use-recordbatch-in-arrow-when-using-datafusion/70057/2
// https://github.com/apache/arrow-rs/blob/6.5.0/arrow/src/util/pretty.rs
// let record_batches_vec = record_batches.to_vec();
let mut header = Vec::new();
// let mut rows = Vec::new();
for record_batch in data_vec {
// get data
println!("record_batch.columns: : {:?}", record_batch.columns());
for col in record_batch.columns() {
for row in 0..col.len() {
// println!("Cow: {:?}", col);
// println!("Row: {:?}", row);
// let value = col.as_any().downcast_ref::<StringArray>().unwrap().value(row);
// rows.push(value);
}
}
// get headers
for field in record_batch.schema().fields() {
header.push(field.name().to_string());
}
};
// println!("Header: {:?}", header);
// Delete temp csv
create_or_delete_csv_file(file_name_string.clone(), None, "delete");
}
I am not sure that DataFusion is the best tool for converting a CSV string into a JSON string; however, here is a working version of your code:
#[tokio::main]
async fn main() {
    let file_name_string = "temp_file.csv".to_string();
    let csv_data_string = "heading,value\nbasic,1\ncsv,2\nhere,3".to_string();
    // Create a temporary file
    create_or_delete_csv_file(file_name_string.clone(), Some(csv_data_string), "create");
    // Create a session context
    let ctx = SessionContext::new();
    // Register the csv file
    ctx.register_csv("t1", &file_name_string, CsvReadOptions::new().has_header(false))
        .await.unwrap();
    let df = ctx.sql("SELECT * FROM t1").await.unwrap();
    // collect to vec
    let record_batches = df.collect().await.unwrap();
    // get json rows
    let json_rows = datafusion::arrow::json::writer::record_batches_to_json_rows(&record_batches[..]).unwrap();
    println!("JSON: {:?}", json_rows);
    // Delete temp csv
    create_or_delete_csv_file(file_name_string.clone(), None, "delete");
}
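If you encounter conflicts between arrow and datafusion types, use datafusion::arrow (the arrow crate as re-exported by DataFusion) instead of a separately added arrow dependency, so that both sides refer to the same RecordBatch type.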

How to handle String.parse<_> ParseIntError [rust]

I finished converting the Rust book's guessing game example to an Iced GUI application and wanted to add error handling for the user's input.
I have a String that I'm trying to convert to an i32, and I'm not sure how to handle the error if the user types non-numeric text into the text_input or just hits return. I figured out a simple solution:
self.hidden_compare = self.user_guess_text.trim().parse::<i32>().unwrap_or(0);
Rather than having self.hidden_compare default to 0, I would rather have self.user_guess_text default to a String I set earlier in the application. Being fairly new, I'm unsure how to accomplish this.
Edit: Full function added for clarification.
fn update(&mut self, message: Message) -> Command<Message> {
match message {
Message::BtnGuessNow => {
self.hidden_compare = self.user_guess_text.trim().parse::<i32>().unwrap_or(0);
if self.hidden_value == self.hidden_compare {
self.label_compare = String::from("A WINNER!");
self.number_ofguesses.push(self.hidden_compare.to_string() + ", A WINNER!");
}
else if self.hidden_value > self.hidden_compare {
self.label_compare = String::from("Too Low");
self.number_ofguesses.push(self.hidden_compare.to_string() + ", Too Low!");
}
else if self.hidden_value < self.hidden_compare {
self.label_compare = String::from("Too Big");
self.number_ofguesses.push(self.hidden_compare.to_string()+ ", Too Big!");
}
self.user_guess_text = "".to_string();
Command::none()
}
Message::UserInputValueUpdate(x) => {
self.user_guess_text = x;
Command::none()
}
}
and a relevant function that handles the Vec output:
fn guess_output_calc(&self) -> String {
let mut tempoutput = String::new();
for (i, x) in self.number_ofguesses.iter().enumerate().skip(1) {
let guessfmt = String::from(format!("Guess# {} Was: {}\n", i, x));
tempoutput.push_str(&guessfmt);
};
return
"I would rather have self.user_guess_text default to a String"
I'm not entirely sure what you mean by that, but I interpret it as "I want to set the self.user_guess_text variable to a specific value if it can't be converted to an integer". If this is wrong, then please update your question.
This is how I would approach this (simplified):
fn main() {
    let mut user_guess_text = " a ";
    match user_guess_text.trim().parse::<i32>() {
        Ok(value) => {
            println!("Parsed to value: {}", value);
        }
        Err(_) => {
            println!("Unable to parse. Resetting variable.");
            user_guess_text = "fallback text!";
        }
    }
    println!("user_guess_text: {}", user_guess_text);
}
Unable to parse. Resetting variable.
user_guess_text: fallback text!
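In the Iced application the text lives in an owned String field rather than a &str local, so the same idea might look roughly like the sketch below. The helper name parse_or_reset and its fallback parameter are assumptions made for illustration; they are not part of the question's code.

```rust
/// Tries to parse `text` as an i32.
/// On failure, overwrites `text` with `fallback` and returns None.
fn parse_or_reset(text: &mut String, fallback: &str) -> Option<i32> {
    match text.trim().parse::<i32>() {
        Ok(value) => Some(value),
        Err(_) => {
            // Parsing failed (non-numeric or empty input): restore the fallback text.
            text.clear();
            text.push_str(fallback);
            None
        }
    }
}
```

Inside update, the BtnGuessNow arm could then call something like parse_or_reset(&mut self.user_guess_text, "some earlier text") and run the comparison only when it returns Some(value), which avoids treating invalid input as a guess of 0.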

Struct property accessible from method but not from outside

I'm trying to build a basic web crawler in Rust, which I'm trying to port to html5ever. As of right now, I have a function with a struct inside that is supposed to return a Vec<String>. It gets this Vec from the struct in the return statement. Why does it always return an empty vector? (Does it have anything to do with the lifetime parameters?)
fn find_urls_in_html<'a>(
original_url: &Url,
raw_html: String,
fetched_cache: &Vec<String>,
) -> Vec<String> {
#[derive(Clone)]
struct Sink<'a> {
original_url: &'a Url,
returned_vec: Vec<String>,
fetched_cache: &'a Vec<String>,
}
impl<'a> TokenSink for Sink<'a> {
type Handle = ();
fn process_token(&mut self, token: Token, _line_number: u64) -> TokenSinkResult<()> {
trace!("token {:?}", token);
match token {
TagToken(tag) => {
if tag.kind == StartTag && tag.attrs.len() != 0 {
let _attribute_name = get_attribute_for_elem(&tag.name);
if _attribute_name == None {
return TokenSinkResult::Continue;
}
let attribute_name = _attribute_name.unwrap();
for attribute in &tag.attrs {
if &attribute.name.local != attribute_name {
continue;
}
trace!("element {:?} found", tag);
add_urls_to_vec(
repair_suggested_url(
self.original_url,
(&attribute.name.local, &attribute.value),
),
&mut self.returned_vec,
&self.fetched_cache,
);
}
}
}
ParseError(error) => {
warn!("error parsing html for {}: {:?}", self.original_url, error);
}
_ => {}
}
return TokenSinkResult::Continue;
}
}
let html = Sink {
original_url: original_url,
returned_vec: Vec::new(),
fetched_cache: fetched_cache,
};
let mut byte_tendril = ByteTendril::new();
{
let tendril_push_result = byte_tendril.try_push_bytes(&raw_html.into_bytes());
if tendril_push_result.is_err() {
warn!("error pushing bytes to tendril: {:?}", tendril_push_result);
return Vec::new();
}
}
let mut queue = BufferQueue::new();
queue.push_back(byte_tendril.try_reinterpret().unwrap());
let mut tok = Tokenizer::new(html.clone(), std::default::Default::default()); // default default! default?
let feed = tok.feed(&mut queue);
return html.returned_vec;
}
The output ends with no warning (and a panic, caused by another function due to this being empty). Can anyone help me figure out what's going on?
Thanks in advance.
When I initialize the Tokenizer, I use:
let mut tok = Tokenizer::new(html.clone(), std::default::Default::default());
The problem is that I'm passing html.clone() to the Tokenizer instead of html. As a result, the tokenizer writes to returned_vec on the cloned object, not on html, so html.returned_vec stays empty. Changing a few things, such as having the sink work through a mutable reference instead of an owned copy, fixes this problem.
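The effect can be reproduced without html5ever. In the sketch below, Collector and Driver are hypothetical stand-ins for the Sink struct and the Tokenizer; they only mimic the ownership pattern, not the real API.

```rust
#[derive(Clone, Default)]
struct Collector {
    items: Vec<String>,
}

impl Collector {
    fn process(&mut self, token: &str) {
        self.items.push(token.to_string());
    }
}

/// Hypothetical stand-in for a tokenizer: it takes ownership of its sink.
struct Driver {
    sink: Collector,
}

impl Driver {
    fn new(sink: Collector) -> Self {
        Driver { sink }
    }

    fn feed(&mut self, input: &str) {
        for token in input.split_whitespace() {
            self.sink.process(token);
        }
    }
}

/// Mirrors the bug: the driver fills the clone, and the original is returned.
fn collect_with_clone(input: &str) -> Vec<String> {
    let original = Collector::default();
    let mut driver = Driver::new(original.clone());
    driver.feed(input);
    original.items // never touched by the driver, so still empty
}

/// One possible fix: move the sink in and read the result back from the driver.
fn collect_with_move(input: &str) -> Vec<String> {
    let mut driver = Driver::new(Collector::default());
    driver.feed(input);
    driver.sink.items
}
```

Calling collect_with_clone("a b") yields an empty vector, while collect_with_move("a b") yields both tokens, which matches the behaviour described above.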
