Flask-Mongoengine custom order-by comparator - python-3.x

I am using Flask-Mongoengine as an ORM for my Flask application. I define a Document to store unique versions as a string:

class MyDocument(db.Document):
    version = db.StringField(primary_key=True, required=True)
The value stored in the StringField is in the following format:
a.b.c.d
I run into issues when it comes to using order_by for my queries:
MyDocument.objects.order_by('version')
More precisely, if the first part of the version (a in the format example above) contains multiple digits, it doesn't sort properly, because the comparison is lexicographic (e.g. 15.20.142 sorts before 2.7.9). I could obviously do something like:
tmp = MyDocument.objects.all()
result = some_merge_sort(tmp)
However, that requires writing, testing, and maintaining a sorting algorithm. That brings me to my next idea: overriding the comparator used by order_by. The issue is, I'm not sure how to do this properly. Do I have to define a custom attribute type, or is there some hook I'm supposed to override?
In any case, I thought I would ask the good folks on Stack Overflow before I go reinventing the wheel.

After doing some digging into the Mongoengine source code, I found that it uses PyMongo's cursor sort, which itself wraps the queries made to MongoDB. In other words, my Document definition is too high up the chain to affect the order_by result.
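To illustrate the underlying problem, here is a small standalone sketch (the version strings are made up) comparing plain string sorting with parsing via packaging.version:

```python
from packaging import version

versions = ["15.20.142", "2.7.9", "1.7.9"]

# Plain string sorting is lexicographic, so "15..." lands before "2...".
lexicographic = sorted(versions)                      # ['1.7.9', '15.20.142', '2.7.9']

# Parsing each string into a Version object gives numeric ordering.
numeric = sorted(versions, key=version.parse)         # ['1.7.9', '2.7.9', '15.20.142']
```

This is the same version.parse key used in the classmethod below.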
As such, the solution I went with was to write a classmethod that takes a mongoengine.queryset.QuerySet instance and applies a sort on the version attribute.
from mongoengine.queryset import QuerySet
from packaging import version

class MyDocument(db.Document):
    version = db.StringField(primary_key=True, required=True)

    @classmethod
    def order_by_version(cls, queryset, reverse=False):
        """
        Wrapper to order a given queryset of this class by its
        version attribute.
        :params:
            - `QuerySet` :queryset: - The queryset whose elements to order.
            - `bool` :reverse: - Whether the results should be reversed.
        :raises:
            - `TypeError` if the given queryset is not a `QuerySet` instance.
        :returns:
            - `list` containing the elements sorted by the version attribute.
        """
        # Instance check the queryset.
        if not isinstance(queryset, QuerySet):
            raise TypeError("order_by_version requires a QuerySet instance!")
        # Sort by the parsed version attribute.
        return sorted(queryset, key=lambda obj: version.parse(obj.version), reverse=reverse)
Then to utilize this method, I can simply do as follows:
MyDocument.order_by_version(MyDocument.objects)
As for my reason for taking the QuerySet instance rather than building the query inside the classmethod: it simply allows other Mongoengine methods to be called before the ordering is applied. Because sorted is stable, it maintains any existing order, so I can still use the built-in order_by for other attributes first.

Related

django: simulate a flags field using BinaryField

So I'm trying to simulate a flags field in Django (4.0, Python 3) the same way I would do it in C or C++. It would look like this:
typedef enum {
    flagA = 0,
    flagB,
    flagC
} myFlags;
Having a uint8 that by default is 00000000, and then, depending on whether the flags are on or off, I'd do some bitwise operations to turn the three least significant bits to 1 or 0.
Now, I could do that in my model by simply declaring a PositiveSmallIntegerField or BinaryField and just creating some helper functions to manage all this logic.
Note that I DO NOT NEED to be able to query by this field. I just want to be able to store it in the DB and very occasionally modify it.
Since it's possible to extend the Fields, I was wondering if it would be cleaner to encapsulate all this logic inside a custom Field inheriting from BinaryField. But I'm not really sure how can I manipulate the Field value from my custom class.
class CustomBinaryField(models.BinaryField):
    description = "whatever"

    def __init__(self, *args, **kwargs):
        kwargs['max_length'] = 1
        super().__init__(*args, **kwargs)
For instance, I'd like to create a method inside CustomBinaryField like the following, where myFlagsStr contains a str representation of the enum:
def getActiveFlags(self):
    # For each bit which is set to 1 in the binary value,
    # add its name to an array, e.g.: [flagA, flagC]
    array = []
    if self.value & (1 << myFlags.flagA):
        array.append(myFlagsStr[myFlags.flagA])
    if self.value & (1 << myFlags.flagB):
        array.append(myFlagsStr[myFlags.flagB])
    if self.value & (1 << myFlags.flagC):
        array.append(myFlagsStr[myFlags.flagC])
    return array
I'm not sure how to get the actual value stored in the DB to make these if comparisons.
Maybe mine is not the best approach, so I'm open to any suggestions you might have. But I think I could manage to do it this way if I knew how to get the actual binary value from the DB inside my functions.
I have seen that the library https://github.com/disqus/django-bitfield handles this, but it is limited to PostgreSQL, and, as mentioned before, I don't really need to filter by these flags, so something simpler will do too.
Well, in Django the common approach for building such functionality is a MultipleChoiceField. It presumes that the data is stored in a related table, which, I believe, is not quite what you want.
The second option is ArrayField, which also isn't suitable, since you don't want your solution limited to PostgreSQL.
If you're going to do this quickly and straightforward, you might use JSONField and store the string or numeric IDs of your Choices. But if you are accustomed to C++, you're not gonna like it this way :)
JSONField is supported on MariaDB 10.2.7+, MySQL 5.7.8+, Oracle, PostgreSQL, and SQLite (with the JSON1 extension enabled).
If that doesn't appeal, you should look at SmallIntegerField: it's stored as a 16-bit signed integer, and you can maintain it with a getter/setter approach. The idea behind the methods you suggested is right in general.
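A minimal sketch of that getter/setter idea, using a plain attribute in place of the model's integer field (all names here are illustrative, not Django API; in a real model the attribute would be a PositiveSmallIntegerField(default=0)):

```python
import enum

class MyFlags(enum.IntFlag):
    """Python's IntFlag mirrors the C enum-of-bits idea."""
    FLAG_A = 1 << 0
    FLAG_B = 1 << 1
    FLAG_C = 1 << 2

class Profile:
    # Stand-in for: flags = models.PositiveSmallIntegerField(default=0)
    def __init__(self, flags=0):
        self.flags = flags

    def set_flag(self, flag):
        self.flags |= int(flag)

    def clear_flag(self, flag):
        self.flags &= ~int(flag)

    def active_flags(self):
        # Names of all bits currently set, e.g. ['FLAG_A', 'FLAG_C'].
        return [f.name for f in MyFlags if self.flags & f]

profile = Profile()
profile.set_flag(MyFlags.FLAG_A | MyFlags.FLAG_C)
```

Since the column is just an integer, this works on any database backend, and the bitwise helpers live entirely in Python.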
Good luck :)

Contribution analyses - tagged.database

I need to get the single contributions of the processes and emissions I filled into my database, similar to this problem: Brightway2 - Get LCA scores of immediate exchanges
It works for single methods, but I was wondering how to get these results for several methods, similar to the ordinary calculations, which can then be saved as CSV. Is there a way to create a loop for this?
Thank you so much!
Miriam
There is a function called multi_traverse_tagged_databases in bw2analyzer which should do what you need. It was part of a pull request, so it's not in the docs.
I've copied the docstring in at the bottom, which should give you some pointers. It's basically the same as the traverse_tagged_databases function used in the question you've linked to, but for multiple methods. You'd use it like this:
results, graph = multi_traverse_tagged_databases(functional_unit, list_of_methods, label='name')
You should be able to use pandas to export the dictionary you get in results to a csv file.
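For the CSV step, something along these lines should work. The shape of results here is an assumption for illustration: a dict mapping each tag to one score per method.

```python
import pandas as pd

# Hypothetical stand-in for the dict returned by
# multi_traverse_tagged_databases; real keys and scores come from your project.
results = {"transport": [1.2, 0.8], "materials": [3.4, 2.1]}
methods = ["method A", "method B"]

df = pd.DataFrame(results, index=methods)
csv_text = df.to_csv()  # or pass a filename to write it to disk
```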
def multi_traverse_tagged_databases(
    functional_unit, methods, label="tag", default_tag="other", secondary_tags=[]
):
    """Traverse a functional unit throughout its foreground database(s), and
    group impacts (for multiple methods) by tag label.

    Input arguments:
        * ``functional_unit``: A functional unit dictionary, e.g. ``{("foo", "bar"): 42}``.
        * ``methods``: A list of method names, e.g. ``[("foo", "bar"), ("baz", "qux"), ...]``
        * ``label``: The label of the tag classifier. Default is ``"tag"``
        * ``default_tag``: The tag classifier to use if none was given. Default is ``"other"``
        * ``secondary_tags``: List of tuples in the format (secondary_label, secondary_default_tag). Default is empty list.

    Returns:
        Aggregated tags dictionary from ``aggregate_tagged_graph``, and tagged supply chain graph from ``recurse_tagged_database``.
    """

Need Help creating class hierarchy in Python

I have a hierarchy of data that I would like to build using classes instead of hard-coding it. The structure is like so:
Unit (has name, abbreviation, subsystems[5 different types of subsystems])
Subsystem ( has type, block diagram(photo), ParameterModel[20 different sets of parameterModels])
ParameterModel (30 or so parameters that will have [parameter name, value, units, and model index])
I'm not sure how to do this using classes, but what I have made kind of work so far is nested dictionaries:
{'Unit': {'Unit1': {'Subsystem': {'Generator': {'Parameter': {'Name': 'param1', 'Value': 1, 'Units': 'seconds'}}}}}
like this, but with 10-15 units, 5-6 subsystems, and 30 or so parameters per subsystem. I know dictionaries are not the best way to go about it, but I cannot figure out the class structure or where to start building it.
I want to be able to create, read, update, and delete parameters in a tkinter GUI that I have built, as well as export/import these system parameters and do calculations on them. I can handle the calculations and the import/export, but I need classes that will build out this structure and be able to reference each individual unit/subsystem/parameter/value/etc.
I know that's a lot, but any advice? I've been looking into the factory and abstract factory patterns in the hope of figuring out how to create the code structure, but to no avail. I have experience with MATLAB, Visual Basic, C++, and various Arduino projects, so I know most basic programming, but this inheritance/class structure is something I cannot figure out how to do in an abstract way without hardcoding each parameter with giant names like Unit1_Generator_parameterName_parameter = ____, and I really don't want to do that.
Thanks,
-A
EDIT: Here is one way I've done the implementation using a dictionary, but I would like to do this using a class that can take a list, make a bunch of empty attributes, and have those be editable/callable, generally like setParamValue(unit, subsystem, param), where I can pass the unit, the subsystem, and then the parameter (such as 'Td') and then be able to change the value of the key/value pair within this hierarchy.
def create_keys(keys):
    return {key: None for key in keys}

unit_list = ['FL','ES','NN','SF','CC','HD','ND','TH']  # unit abbreviations
sub_list = ['Gen','Gov','Exc','PSS','Rel','BlkD']
params_GENROU = ["T'do","T''do","T'qo","T''qo",'H','D','Xd','Xq',"Xd'","Xq'","X''d=X''q",'Xl','S(1.0)','S(1.2)','Ra']  # parameter names

dict = create_keys(unit_list)
for key in dict:
    dict[key] = create_keys(sub_list)
    dict[key]['Gen'] = create_keys(params_GENROU)
and inside each dict[unit]['Gen'][param] there should be a dict containing Value, Units (seconds, degrees, etc.), Description, and CON (basically an index for another program we use).
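One hedged sketch of a class-based equivalent, using dataclasses and dicts keyed by name (all class and method names here are illustrative, not from the asker's code; real units and parameter sets would be built from the lists above rather than hardcoded):

```python
from dataclasses import dataclass, field

@dataclass
class Parameter:
    name: str
    value: float = None
    units: str = ""
    con: int = None  # index for the external program

@dataclass
class Subsystem:
    type: str
    parameters: dict = field(default_factory=dict)  # name -> Parameter

@dataclass
class Unit:
    name: str
    abbreviation: str
    subsystems: dict = field(default_factory=dict)  # type -> Subsystem

class System:
    """Top-level container providing the setParamValue-style access."""
    def __init__(self):
        self.units = {}

    def add_unit(self, abbr):
        self.units[abbr] = Unit(name=abbr, abbreviation=abbr)

    def set_param_value(self, unit, subsystem, param, value):
        sub = self.units[unit].subsystems.setdefault(subsystem, Subsystem(subsystem))
        sub.parameters.setdefault(param, Parameter(param)).value = value

    def get_param_value(self, unit, subsystem, param):
        return self.units[unit].subsystems[subsystem].parameters[param].value

system = System()
system.add_unit('FL')
system.set_param_value('FL', 'Gen', 'Td', 0.5)
```

The dicts keep lookup by name (as in the nested-dictionary version), while each level is a proper class that can grow methods for import/export or calculations.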

Is there a pandas filter that allows any value? [duplicate]

I have discovered the pandas DataFrame.query method, and it does almost exactly what I need (I had implemented my own parser, since I hadn't realized it existed, but really I should be using the standard method).
I would like my users to be able to specify the query in a configuration file. The syntax seems intuitive enough that I can expect my non-programmer (but engineer) users to figure it out.
There's just one thing missing: a way to select everything in the dataframe. Sometimes what my users want to use is every row, so they would put 'All' or something into that configuration option. In fact, that will be the default option.
I tried df.query('True') but that raised a KeyError. I tried df.query('1') but that returned the row with index 1. The empty string raised a ValueError.
The only things I can think of are 1) put an if clause every time I need to do this type of query (probably 3 or 4 times in the code) or 2) subclass DataFrame and either reimplement query, or add a query_with_all method:
import pandas as pd

class MyDataFrame(pd.DataFrame):
    def query_with_all(self, query_string):
        if query_string.lower() == 'all':
            return self
        else:
            return self.query(query_string)
And then use my own class every time instead of the pandas one. Is this the only way to do this?
Keep things simple, and use a function:
def query_with_all(data_frame, query_string):
    if query_string == "all":
        return data_frame
    return data_frame.query(query_string)
Whenever you need this type of query, just call the function with the data frame and the query string. There's no need for extra if statements or to subclass pd.DataFrame.
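A quick self-contained usage sketch of that helper (the sample frame is made up):

```python
import pandas as pd

def query_with_all(data_frame, query_string):
    if query_string == "all":
        return data_frame
    return data_frame.query(query_string)

df = pd.DataFrame({"a": [1, 2, 3]})

full = query_with_all(df, "all")      # returns the frame unchanged
subset = query_with_all(df, "a > 1")  # normal DataFrame.query path
```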
If you're restricted to using df.query, you can use a global variable:
ALL = slice(None)
df.query('@ALL', engine='python')
If you're not allowed to use global variables, and if your DataFrame isn't MultiIndexed, you can use
df.query('tuple()')
All of these will properly handle NaN values.
df.query('ilevel_0 in ilevel_0') will always return the full dataframe, even when the index contains NaN values or the dataframe is completely empty.
In your particular case, you could then define a global variable all_true = 'ilevel_0 in ilevel_0' (as suggested in the comments by Zero), so that your engineers could use the name of the global variable in their config file instead.
This statement is just a roundabout way of querying True, as you already tried; ilevel_0 is a more formal way of referring to the index. See the docs for more details on using in and ilevel_0: https://pandas.pydata.org/pandas-docs/stable/indexing.html#the-query-method
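A small check of the ilevel_0 trick (the sample data, including the NaN index label, is invented here):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]}, index=[0.0, np.nan, 2.0])

all_true = "ilevel_0 in ilevel_0"
full = df.query(all_true)  # keeps all rows, NaN-indexed row included
```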

Builder with optional source

I want to make a Builder with one or more optional sources.
I tried this:
def do_something(target, source, env):
    if source[1]:
        do_optional_stuff(source[1])
    do_other_stuff(target, source[0])
    ...

env.Append(BUILDERS = {'my_builder': Builder(action = Action(do_something))})

env.my_builder('target.txt', [source1, None])     # Fails
env.my_builder('target.txt', [source2, source3])  # Okay
The trouble is, I get 'NoneType' object has no attribute 'get_ninfo' when I pass in None, because SCons expects Node arguments, and None isn't acceptable.
Is there anything I can do?
Edit:
As noted in the answer below, it's possible to solve this for the simple case of one optional argument, by varying the length of source list. This doesn't work for making arbitrary arguments optional, so I'd still be interested in a way to do that.
Instead of adding a bogus element, check the length of the source list (or better yet, iterate over the list starting after the first element):
def do_something(target, source, env):
    if len(source) > 1:
        do_optional_stuff(source[1])
    # or:
    # for opt_src in source[1:]:
    #     do_optional_stuff(opt_src)
    do_main_stuff(target, source[0])

env.Append(BUILDERS = {'my_builder': Builder(action = Action(do_something))})

env.my_builder('target-2.txt', ['foo.txt'])
env.my_builder('target-1.txt', ['foo.txt', 'bar.txt'])
One issue with this approach is that you need to ensure that your sources are listed in the right order. Depending on the details of what you're doing, you might be able to filter the source list by matching file names or extensions. After all, this is Python code, you have the full power of the language at your disposal.
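If the sources can't be relied on to arrive in a fixed order, the filtering idea might look like this sketch (the extensions and helper names are invented for illustration; do_optional_stuff and do_main_stuff stand in for the real action logic):

```python
def split_sources(source):
    # Treat .txt nodes as the mandatory inputs and .cfg nodes as optional;
    # SCons Nodes stringify to their paths, so str() works for matching.
    main = [s for s in source if str(s).endswith(".txt")]
    optional = [s for s in source if str(s).endswith(".cfg")]
    return main, optional

def do_something(target, source, env):
    main, optional = split_sources(source)
    for opt_src in optional:
        do_optional_stuff(opt_src)
    do_main_stuff(target, main[0])
```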
