Rust Polars feature select in Cargo.toml not working - rust

I am seeing some strange behavior; apparently I have messed something up.
My Cargo.toml file:
[package]
name = "test"
version = "0.1.0"
edition = "2021"
[dependencies]
# Version 0.22.7 ==> works
polars = {version = "0.22.8", features = ["lazy"]}
# Version 0.23.0 ==> Does Not Work ... and it will load the 0.23.2 version?!
#polars = {version = "0.23.0", features = ["lazy"]}
My main.rs:
use polars::prelude::*;

pub fn main() {
    let path = "C:\\temp\\rusty.csv";
    let days = LazyCsvReader::new(path.into())
        .has_header(false)
        .finish()
        .unwrap()
        .collect();
}
Error:
error[E0433]: failed to resolve: use of undeclared type `LazyCsvReader`
  --> src\main.rs:25:16
   |
25 |     let days = LazyCsvReader::new(path.into())
   |                ^^^^^^^^^^^^^ use of undeclared type `LazyCsvReader`
Any ideas ...?
Digging further I can see that part of the feature-tree is missing in version 0.23.2 of polars:
│ ├── polars feature "csv-file"
│ │ ├── polars v0.22.8 (*)
│ │ ├── polars feature "polars-io"
│ │ │ └── polars v0.22.8 (*)
│ │ ├── polars feature "polars-lazy"
│ │ │ └── polars v0.22.8 (*)
│ │ ├── polars-io feature "csv-file" (*)
│ │ └── polars-lazy feature "csv-file"
│ │ ├── polars-lazy v0.22.7 (*)
│ │ └── polars-io feature "csv-file" (*)
==> a BUG?
Version 0.23.1 of polars is feature-complete and does not have this problem.
So my workaround question is: how do I force a specific version to be part of my project?
This:
polars = {version = "0.23.1", features = ["lazy"]}
did not work ...

Thanks to @isaactfa we have this workaround/solution:
polars = {version = "0.23.2", features = ["lazy", "csv-file"]}
My understanding is that the "csv-file" feature is a dependency feature of "lazy" and thus should have been loaded with just the "lazy" flag.
The other workaround is to cap the polars version at 0.23.1 with a "<= 0.23.1" requirement:
polars = {version = "<= 0.23.1", features = ["lazy"]}
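A note on why version = "0.23.1" did not pin the version: a bare version in Cargo.toml is a caret requirement, so "0.23.1" means >=0.23.1, <0.24.0, and Cargo is free to select 0.23.2. An exact pin is written "=0.23.1". The sketch below mimics that rule in Python for plain three-part versions only. caret_bounds and caret_matches are hypothetical helper names (they are not part of Cargo), and pre-release or partial versions are deliberately ignored.

```python
def caret_bounds(requirement: str):
    """Return (lower, upper) bounds for a bare 'MAJOR.MINOR.PATCH' requirement.

    Simplified model of Cargo's default (caret) rule: the left-most
    non-zero component may not change.
    """
    major, minor, patch = (int(part) for part in requirement.split("."))
    lower = (major, minor, patch)
    if major > 0:
        upper = (major + 1, 0, 0)
    elif minor > 0:
        upper = (0, minor + 1, 0)
    else:
        upper = (0, 0, patch + 1)
    return lower, upper


def caret_matches(requirement: str, candidate: str) -> bool:
    """True if `candidate` satisfies the caret requirement `requirement`."""
    lower, upper = caret_bounds(requirement)
    version = tuple(int(part) for part in candidate.split("."))
    return lower <= version < upper


if __name__ == "__main__":
    # "0.23.1" still admits 0.23.2, which is why the newer release was picked.
    print(caret_matches("0.23.1", "0.23.2"))  # True
    # "0.22.8" does not admit any 0.23.x release.
    print(caret_matches("0.22.8", "0.23.2"))  # False
```

Under this rule, the "<= 0.23.1" requirement works because it replaces the caret range with an explicit upper bound.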

Related

Copy subfolders from one location to another that has the same structure but only overwrite the new versions

I have 2 locations with folders and multiple subfolders as follows:
Location 1
2022_10
├── FolderA
│   └── Version 3
│       └── FolderA.docx
├── FolderB
│   └── Version 2
│       └── FolderB.docx
├── FolderC
│   └── Version 2
│       └── FolderC.docx
└── FolderD
    └── Version 3
        └── FolderD.docx
Location 2
2022_10
├── FolderA
│   ├── Version 1
│   │   └── FolderA.docx
│   └── Version 2
│       └── FolderA.docx
├── FolderB
│   └── Version 1
│       └── FolderB.docx
├── FolderC
│   └── Version 1
│       └── FolderC.docx
└── FolderD
    ├── Version 1
    │   └── FolderA.docx
    └── Version 2
        └── FolderA.docx
Location 1 has the latest version of the subfolders, which I need to copy to Location 2 (a centralized repository) while respecting the folder structure and the previous folder versions that already exist there.
In the end, my objective is for Location 2 to ingest only the latest version, as follows; if the version already exists there, the script should overwrite it with the copy from Location 1.
Location 2
2022_10
├── FolderA
│   ├── Version 1
│   │   └── FolderA.docx
│   ├── Version 2
│   │   └── FolderA.docx
│   └── Version 3
│       └── FolderA.docx
├── FolderB
│   ├── Version 1
│   │   └── FolderB.docx
│   └── Version 2
│       └── FolderB.docx
With some help I got a script up and running that replicates the Location 1 structure, so that part is done. Now I'm thinking about the best way to have the same script also perform the copy between locations, perhaps using shutil with a multi-option menu: -A to generate the Location 1 structure and -C to copy from Location 1 to Location 2.
Here is the code I have:
#!/usr/bin/env python3
import docx, os, glob, re, shutil, sys
from pathlib import Path

# Folder to process, taken from the first command-line argument
folder = sys.argv[1]


# Create a new folder if the path does not exist
def create_dir(path):
    is_exist = os.path.exists(path)
    if not is_exist:
        os.makedirs(path)


for file in glob.glob(os.path.join(folder, '*.docx')):
    main_folder = os.path.join(folder, Path(file).stem)
    file_name = os.path.basename(file)
    # The version information is read from the seventh paragraph of each Word document
    doc = docx.Document(file).paragraphs[6].text
    # Match the line "Version Number: (.*)" and keep only the part after the colon
    version_number = re.search("(Version Number: (.*))", doc).group(1)
    version_subfolder = version_number.split(':')[1].strip()
    # Path to the actual subfolder named after the version
    version_subfolder = os.path.join(main_folder, version_subfolder)
    # Destination path of the file inside the version folder
    dest_file_path = os.path.join(version_subfolder, file_name)
    for i in [main_folder, version_subfolder]:
        create_dir(i)
    # Move the file to the corresponding version folder (overwrite if it exists)
    if os.path.exists(dest_file_path):
        os.remove(dest_file_path)
    shutil.move(file, version_subfolder)
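For the copy step described above, one possible approach is sketched below. It is not an answer from the thread. It assumes both locations use the FolderX/Version N layout shown earlier and that Python 3.8 or newer is available (for the dirs_exist_ok argument of shutil.copytree). The function name copy_latest_versions is hypothetical, and wiring it to a -C command-line option is left out.

```python
import shutil
from pathlib import Path


def copy_latest_versions(src_root, dst_root):
    """Copy every <folder>/<version> directory from src_root into dst_root.

    Version folders that already exist in dst_root are overwritten file by
    file; version folders that exist only in dst_root are left untouched.
    Returns the list of destination version folders that were written.
    """
    src_root = Path(src_root)
    dst_root = Path(dst_root)
    copied = []
    for main_folder in sorted(p for p in src_root.iterdir() if p.is_dir()):
        for version_dir in sorted(p for p in main_folder.iterdir() if p.is_dir()):
            target = dst_root / main_folder.name / version_dir.name
            # copytree creates missing parent folders; dirs_exist_ok=True lets
            # it write into an existing version folder and replace its files.
            shutil.copytree(version_dir, target, dirs_exist_ok=True)
            copied.append(target)
    return copied
```

Calling copy_latest_versions with the two 2022_10 paths would add FolderA/Version 3 next to the existing Version 1 and Version 2 in Location 2, and would overwrite a version folder that already exists there.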

Specifying dev-dependencies through Cargo feature flags

The Background
I am working on a Rust crate. It has several [dependencies] and several [dev-dependencies]. Some dev-dependencies are new crates, but some are just enabling features of crates that are already specified as dependencies.
For example, I use serde in my crate, but don't need to derive any serialization impls. However, for my tests I do need to derive them, so my Cargo.toml file looked like this:
[package]
name = "example"
version = "73015087.0.0"
edition = "2021"
[lib]
path = "/dev/null"
[dependencies]
serde = "1.0.139"
[dev-dependencies]
serde = { version = "1.0.139", features = ["derive"] }
I didn't like needing to duplicate the version number where I (or tools) might forget to update it in sync with the other one, so I changed the dev-dependency to use an unbounded version:
[dev-dependencies]
serde = { version = ">=0", features = ["derive"] }
This felt kind-of gross (particularly since we need to work around crates.io forbidding truly unbounded versions with *). I don't want to be specifying a second dependency on serde, I just want to enable a feature. I ended up with the following alternative.
The Idea
Instead of specifying my dev-dependencies directly under [dev-dependencies], I specified a dependency back on the crate itself, but with a feature flag named dev-dependencies enabled. I used this flag to enable serde's derive feature without needing to duplicate the version number or use an unbounded range:
[package]
name = "example"
version = "73015087.0.0"
edition = "2021"
[lib]
path = "/dev/null"
[dependencies]
serde = "1.0.139"
[dev-dependencies]
example = { path = ".", default-features = false, features = ["dev-dependencies"] }
[features]
dev-dependencies = [
"serde/derive",
]
After playing with this, I realized that I could use this to list all the versions of all types of dependencies under [dependencies], by adding them as optional and enabling them through the new flag. For example, here is how I've added expect-test, instead of putting it under dev-dependencies.
[package]
name = "example"
version = "73015087.0.0"
edition = "2021"
[lib]
path = "/dev/null"
[dependencies]
expect-test = { version = "1.3.0", optional = true }
serde = "1.0.139"
[dev-dependencies]
example = { path = ".", default-features = false, features = ["dev-dependencies"] }
[features]
dev-dependencies = [
"serde/derive",
"expect-test",
]
It's a little weird, but I might prefer to have all of the versions in one place and avoid the potential duplication. I'm deciding whether to stick with it.
The Question
Does this produce any significant differences in behaviour from expressing dev-dependencies the conventional way, or is it equivalent? Am I going to end up with more confusing resolver issues down the line?
Here are the resolutions with and without dev dependencies enabled, according to cargo-tree:
$ cargo tree -e normal
example v73015087.0.0 (/workspaces/example)
└── serde v1.0.139
$ cargo tree -e normal,dev
example v73015087.0.0 (/workspaces/example)
├── expect-test v1.3.0
│   ├── dissimilar v1.0.4
│   └── once_cell v1.13.0
└── serde v1.0.139
    └── serde_derive v1.0.139 (proc-macro)
        ├── proc-macro2 v1.0.40
        │   └── unicode-ident v1.0.2
        ├── quote v1.0.20
        │   └── proc-macro2 v1.0.40 (*)
        └── syn v1.0.98
            ├── proc-macro2 v1.0.40 (*)
            ├── quote v1.0.20 (*)
            └── unicode-ident v1.0.2
[dev-dependencies]
└── example v73015087.0.0 (/workspaces/example) (*)

Python 3.9: importlib exec_module() does not execute module

Given the following Python module layout:
app/
├── drivers
│   ├── mydriver
│   │   ├── driver.py
│   │   └── __init__.py
│   └── __init__.py
├── __init__.py
└── main.py
I am trying to dynamically import the "mydriver" module in main.py:
import os
import importlib.machinery
import importlib.util
driver_dir = os.path.join(os.path.dirname(__file__), 'drivers')
loader_details = (
importlib.machinery.ExtensionFileLoader,
importlib.machinery.EXTENSION_SUFFIXES
)
finder = importlib.machinery.FileFinder(driver_dir, loader_details)
spec = finder.find_spec('mydriver')
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
# The following line produces AttributeError: module 'mydriver' has no attribute 'driver'
driver = getattr(module, 'driver')
drivers/mydriver/__init__.py contains the following:
from . import driver
print("TEST")
So the result is the AttributeError shown in the inline comment. The print() call in __init__.py is not executed either.
Any hints why the module is apparently not being evaluated?
While I haven't found a root cause, I did find a (not so pretty) workaround. For some reason, the module cannot be executed if it was found using the FileFinder. It does, however, execute if I do the following:
sys.path.insert(0, driver_dir)
spec = importlib.util.find_spec('mydriver')
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
So in short, I don't know what Python wants from a Finder to also execute modules, not just files. Well, at least I have working code for now...
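One plausible explanation, not confirmed in the thread: the FileFinder in the question is given only ExtensionFileLoader with EXTENSION_SUFFIXES, so it can match compiled extension modules but never mydriver/__init__.py as Python source. The sketch below keeps the FileFinder approach but passes SourceFileLoader with SOURCE_SUFFIXES instead, and registers the module in sys.modules before executing it so that the relative import "from . import driver" can resolve. load_package is a hypothetical helper name.

```python
import importlib.machinery
import importlib.util
import sys


def load_package(parent_dir, name):
    """Load the source package `name` located directly inside `parent_dir`."""
    loader_details = (
        importlib.machinery.SourceFileLoader,
        importlib.machinery.SOURCE_SUFFIXES,
    )
    finder = importlib.machinery.FileFinder(parent_dir, loader_details)
    spec = finder.find_spec(name)
    if spec is None or spec.loader is None:
        raise ImportError(f"no source package {name!r} found in {parent_dir!r}")
    module = importlib.util.module_from_spec(spec)
    # Register the package first: relative imports inside __init__.py look
    # up the parent package in sys.modules.
    sys.modules[name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Do not leave a half-initialized module behind on failure.
        sys.modules.pop(name, None)
        raise
    return module
```

With the layout from the question, load_package(driver_dir, 'mydriver') would be expected to run __init__.py and expose the driver submodule as an attribute of the returned module.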

Can we change something in child modules ".terraform/modules"?

When I run terraform plan I got the following errors:
Error: Reference to undeclared input variable
│
│ on .terraform/modules/ec2-datapipeline/main.tf line 5, in resource "aws_instance" "this":
│ 5: count = "${var.count}"
│
│ An input variable with the name "count" has not been declared. This
│ variable can be declared with a variable "count" {} block.
╵
╷
│ Error: Incorrect attribute value type
│
│ on .terraform/modules/ec2-datapipeline/main.tf line 13, in resource "aws_instance" "this":
│ 13: vpc_security_group_ids = ["${var.vpc_security_group_ids}"]
│ ├────────────────
│ │ var.vpc_security_group_ids is a list of string, known only after apply
│
│ Inappropriate value for attribute "vpc_security_group_ids": element 0:
│ string required.
╵
╷
│ Error: Unsupported argument
│
│ on .terraform/modules/ec2-datapipeline/main.tf line 23, in resource "aws_instance" "this":
│ 23: root_block_device = "${var.root_block_device}"
│
│ An argument named "root_block_device" is not expected here. Did you mean to
│ define a block of type "root_block_device"?
╵
╷
│ Error: Unsupported argument
│
│ on .terraform/modules/ec2-datapipeline/main.tf line 24, in resource "aws_instance" "this":
│ 24: ebs_block_device = "${var.ebs_block_device}"
│
│ An argument named "ebs_block_device" is not expected here. Did you mean to
│ define a block of type "ebs_block_device"?
╵
╷
│ Error: Unsupported argument
│
│ on .terraform/modules/ec2-datapipeline/main.tf line 25, in resource "aws_instance" "this":
│ 25: ephemeral_block_device = "${var.ephemeral_block_device}"
│
│ An argument named "ephemeral_block_device" is not expected here. Did you
│ mean to define a block of type "ephemeral_block_device"?
╵
╷
│ Error: Error in function call
│
│ on .terraform/modules/ec2-datapipeline/main.tf line 36, in resource "aws_instance" "this":
│ 36: tags = "${merge(var.tags, map("Name", format("%s-%d", var.name, count.index+1)))}"
│ ├────────────────
│ │ count.index is a number, known only after apply
│ │ var.name will be known only after apply
│
│ Call to function "map" failed: the "map" function was deprecated in
│ Terraform v0.12 and is no longer available; use tomap({ ... }) syntax to
│ write a literal map.
When I fixed these errors in the main.tf and variables.tf files under .terraform/modules, the plan succeeded when I triggered it from the CLI. But after I push the code and trigger a plan from the UI, it does not work.
Also, when I run terraform init --upgrade, my fixes are reverted and I see the same errors again.
Could you please help me solve this? Thanks!
The stuff in .terraform/modules/ec2-datapipeline/main.tf is actually the main.tf file from an external module called "ec2-datapipeline" referenced somewhere in your own code (look for module "ec2-datapipeline").
You have to fix the bug in the source for "ec2-datapipeline", commit and tag it, then update the source reference in your own code.
The reason your changes disappear on reinitialization is that you passed the --upgrade switch, which tells Terraform to redownload the external module sources even if they are cached locally in the .terraform/modules cache.

what's causing terraform error: Call to function "formatlist" failed: error on format iteration 0: unsupported value for "%s" at 5: string required

In my terraform code I have the following locals
locals {
merged_acl_contributors = concat(var.workspace.acl.contributors, azurerm_synapse_workspace.workspace.identity)
contributors = formatlist("user:%s:rwx", local.merged_acl_contributors)
}
var.workspace.acl.contributors is empty (it is just []). When I try to deploy this I get:
│ Error: Error in function call
│
│ on modules/synapse_v2/main.tf line 10, in locals:
│ 10: contributors = formatlist("user:%s:rwx", local.merged_acl_contributors)
│ ├────────────────
│ │ local.merged_acl_contributors is tuple with 1 element
│
│ Call to function "formatlist" failed: error on format iteration 0:
│ unsupported value for "%s" at 5: string required.
identity in azurerm_synapse_workspace.workspace is a block with multiple attributes. You have to choose what you want. For example:
merged_acl_contributors = concat(var.workspace.acl.contributors, [azurerm_synapse_workspace.workspace.identity.principal_id])
Looking at local.merged_acl_contributors was the answer. The value was wrong. Instead of azurerm_synapse_workspace.identity, it needed to be azurerm_synapse_workspace.managedidentity.
