Contribution analyses - tagged.database - brightway

I need to get the individual contributions of the processes and emissions I entered into my database, similar to this problem: Brightway2 - Get LCA scores of immediate exchanges
It works for a single method, but I was wondering how to get these results for several methods, as in the ordinary calculations, so that they can then be saved as CSV. Is there a way to create a loop for this?
Thank you so much!
Miriam

There is a function called multi_traverse_tagged_databases in bw2analyzer which should do what you need. It was part of a pull request so it's not in the docs.
I've copied in the docstring at the bottom which should give you some pointers. It's basically the same as the traverse_tagged_databases function used in the question you've linked to, but for multiple methods. You'd use it like this:
results, graph = multi_traverse_tagged_databases(functional_unit, list_of_methods, label='name')
You should be able to use pandas to export the dictionary you get in results to a csv file.
def multi_traverse_tagged_databases(
    functional_unit, methods, label="tag", default_tag="other", secondary_tags=[]
):
    """Traverse a functional unit throughout its foreground database(s), and
    group impacts (for multiple methods) by tag label.

    Input arguments:
        * ``functional_unit``: A functional unit dictionary, e.g. ``{("foo", "bar"): 42}``.
        * ``methods``: A list of method names, e.g. ``[("foo", "bar"), ("baz", "qux"), ...]``
        * ``label``: The label of the tag classifier. Default is ``"tag"``
        * ``default_tag``: The tag classifier to use if none was given. Default is ``"other"``
        * ``secondary_tags``: List of tuples in the format (secondary_label, secondary_default_tag). Default is empty list.

    Returns:
        Aggregated tags dictionary from ``aggregate_tagged_graph``, and tagged supply chain graph from ``recurse_tagged_database``.
    """
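For the CSV export, here is a minimal sketch using only the standard library csv module rather than pandas. It assumes that results maps each tag to a list with one score per method, in the same order as list_of_methods; that shape is an assumption, so check what you actually get back. The example values and the results_to_csv helper are made up for illustration.

```python
import csv
import io

# Hypothetical stand-ins: real values would come from
# multi_traverse_tagged_databases(functional_unit, list_of_methods, label='name').
list_of_methods = [("foo", "bar"), ("baz", "qux")]
# Assumed shape: tag -> one score per method, in the order of list_of_methods.
results = {"transport": [1.5, 0.2], "other": [0.3, 0.05]}


def results_to_csv(results, methods, handle):
    """Write one row per tag and one column per method."""
    writer = csv.writer(handle)
    writer.writerow(["tag"] + [" | ".join(method) for method in methods])
    for tag, scores in results.items():
        writer.writerow([tag] + list(scores))


# An in-memory buffer keeps the sketch side-effect free; swap it for
# open("contributions.csv", "w", newline="") to write a real file.
buffer = io.StringIO()
results_to_csv(results, list_of_methods, buffer)
csv_text = buffer.getvalue()
```

With pandas, a DataFrame built from the same dictionary could be written with its to_csv method instead.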


What does this Python function signature mean in the Kedro tutorial?

I am looking at the Kedro library, as my team is considering using it for our data pipeline.
While going through the official tutorial (Spaceflight), I came across this function:
def preprocess_companies(companies: pd.DataFrame) -> pd.DataFrame:
    """Preprocess the data for companies.

    Args:
        companies: Source data.
    Returns:
        Preprocessed data.
    """
    companies["iata_approved"] = companies["iata_approved"].apply(_is_true)
    companies["company_rating"] = companies["company_rating"].apply(_parse_percentage)
    return companies
companies is the name of the CSV file containing the data.
Looking at the function, my assumption is that (companies: pd.DataFrame) is shorthand for reading the "companies" dataset as a DataFrame. If so, I do not understand what -> pd.DataFrame at the end means.
I tried looking through the Python documentation for this style of code, but I did not manage to find anything.
Any help in understanding this is much appreciated.
Thank you
This is the way of declaring the type of your inputs (companies: pd.DataFrame). Here companies is the argument and pd.DataFrame is its type. In the same way, -> pd.DataFrame is the type of the output.
Overall, it says that the function takes companies of type pd.DataFrame and returns a value of type pd.DataFrame.
I hope that helps.
The -> notation is type hinting, as is the : part in the companies: pd.DataFrame function definition. This is not essential to do in Python but many people like to include it. The function definition would work exactly the same if it didn't contain this but instead read:
def preprocess_companies(companies):
This is a general Python thing rather than anything kedro-specific.
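To illustrate that the hints are only metadata, here is a small sketch with plain built-in types instead of pandas. The double and double_plain functions are made up for this example.

```python
def double(value: int) -> int:
    """Annotated version: the hints describe the intended types."""
    return value * 2


def double_plain(value):
    """Same behaviour without any annotations."""
    return value * 2


# The hints are stored on the function object as metadata.
hints = double.__annotations__

# Both versions compute the same result.
same_result = double(21) == double_plain(21)

# Python does not enforce the hints at runtime: a str is accepted
# even though the parameter is annotated as int.
unchecked = double("ab")
```

Tools such as type checkers and IDEs read these annotations; the interpreter itself does not check them when the function is called.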
The way that kedro registers companies as a kedro dataset is completely separate from this function definition and is done through the catalog.yml file:
companies:
  type: pandas.CSVDataSet
  filepath: data/01_raw/companies.csv
There will then be a node defined (in pipeline.py) to specify that the preprocess_companies function should take as input the kedro dataset companies:
node(
    func=preprocess_companies,
    inputs="companies",  # THIS LINE REFERS TO THE DATASET NAME
    outputs="preprocessed_companies",
    name="preprocessing_companies",
),
In theory the name of the parameter in the function itself could be completely different, e.g.
def preprocess_companies(anything_you_want):
... although it is very common to give it the same name as the dataset.
In this situation companies is technically any DataFrame. However, when wrapped in a Kedro Node object the correct dataset will be passed in:
Node(
    func=preprocess_companies,  # The function posted above
    inputs='raw_companies',  # Kedro will read from a catalog entry called 'raw_companies'
    outputs='processed_companies',  # Kedro will write to a catalog entry called 'processed_companies'
)
In essence the parameter name isn't really important here, it has been named this way so that the person reading the code knows that it is semantically about companies, but the function name does that too.
The above is technically a simplification since I'm not getting into MemoryDataSets but hopefully it covers the main points.

How to find most similar to an array in gensim

I know the most_similar method works when entering a previously added string, but how do you reverse search a numpy array of some word?
modelw2v = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin.gz',binary=True)
differenceArr = modelw2v["King"] - modelw2v["Queen"]
# This line does not work
modelw2v.most_similar(differenceArr)
The most_similar() method can take vectors as the origin of a search, but you should explicitly specify them as one member of a list provided to the method's positive parameter, so that its logic for handling more simple origins (like a string or list of strings) isn't confused.
Specifically, this should work with your other code:
modelw2v.most_similar(positive=[differenceArr,])
More generally, you can supply lists of vectors (or word-keys for looking up vectors) to both the positive and negative parameters of this method, and the method will combine them (according to the exact logic you can see in the source code). So for example the prominent word2vec example...
wv('king') - wv('man') + wv('woman') = ?
...can be effected with the most_similar() method without doing your own other vector-arithmetic:
sims = modelw2v.most_similar(positive=['king', 'woman'], negative=['man'])
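To make the idea of searching from a raw vector concrete without downloading a model, here is a toy sketch in plain Python. It is not gensim's implementation and does not reproduce its exact combination logic; the two-dimensional vectors and the cosine and nearest helpers are invented for illustration.

```python
import math

# Toy 2-D "embeddings"; the numbers are made up for illustration only.
toy_vectors = {
    "king": [0.9, 0.8],
    "queen": [0.9, 0.2],
    "man": [0.1, 0.8],
    "woman": [0.1, 0.2],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, vectors, exclude=()):
    """Return the key whose vector has the highest cosine similarity to query."""
    candidates = {key: vec for key, vec in vectors.items() if key not in exclude}
    return max(candidates, key=lambda key: cosine(query, candidates[key]))


# Build the query by hand: king - man + woman.
query = [
    k - m + w
    for k, m, w in zip(toy_vectors["king"], toy_vectors["man"], toy_vectors["woman"])
]
best = nearest(query, toy_vectors, exclude=("king", "man", "woman"))
```

In this toy vocabulary the query lands on "queen"; with real embeddings the result depends on the model.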

Flask-Mongoengine custom order-by comparator

I am using Flask-Mongoengine as an ORM for my Flask application. I define a Document to store unique versions in a string:
class MyDocument(db.Document):
    version = db.StringField(primary_key=True, required=True)
The value stored in the StringField is in the following format:
a.b.c.d
I run into issues when it comes to using order_by for my queries:
MyDocument.objects.order_by('version')
More precisely, if the first part of the version (a in the format example above) contains multiple digits, it doesn't sort properly, because the strings are compared character by character (e.g. 15.20.142 sorts before 2.7.9). I could obviously do something like:
tmp = MyDocument.objects.all()
result = some_merge_sort(tmp)
However, that requires writing, testing and maintaining a sorting algorithm. This brings me to my next solution, which is to override the comparator used in order_by. The issue is, I'm not sure how to do this properly. Do I have to define a custom attribute type, or is there some hook I'm supposed to override?
In any case, I thought I would ask the good folks on Stack Overflow before I go about reinventing the wheel.
After doing some digging into the Mongoengine source code, I found out that it uses the PyMongo cursor sort which itself wraps the queries made to MongoDB. In other words, my Document definition is too high in the chain to even affect the order_by result.
As such, the solution I went with was to write a classmethod. This simply takes in a mongoengine.queryset.QuerySet instance and applies a sort on the version attribute.
from mongoengine.queryset import QuerySet
from packaging import version


class MyDocument(db.Document):
    version = db.StringField(primary_key=True, required=True)

    @classmethod
    def order_by_version(cls, queryset, reverse=False):
        """
        Wrapper function to order a given queryset by this class's
        version attribute.

        :params:
            - `QuerySet` :queryset: - The queryset to order the elements.
            - `bool` :reverse: - If the results should be reversed.
        :raises:
            - `TypeError` if the instance of the given queryset is not a `QuerySet`.
        :returns:
            - `List` containing the elements sorted by the version attribute.
        """
        # Instance check the queryset.
        if not isinstance(queryset, QuerySet):
            raise TypeError("order_by_version requires a QuerySet instance!")
        # Sort by the version attribute.
        return sorted(queryset, key=lambda obj: version.parse(obj.version), reverse=reverse)
Then to utilize this method, I can simply do as follows:
MyDocument.order_by_version(MyDocument.objects)
As for my reason for taking the QuerySet instance rather than doing the query from within the classmethod, it was simply to allow the opportunity to call other Mongoengine methods prior to doing the ordering. Because I used sorted, it will maintain any existing order, so I can still use the built-in order_by for other attributes.
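The difference between plain string ordering and ordering with a key function can be seen in a small standalone sketch. To keep it free of third-party dependencies, it uses a hypothetical version_key helper that converts each dotted part to an integer, as a stand-in for packaging's version.parse; it only handles purely numeric versions.

```python
versions = ["1.7.9", "15.20.142", "2.0.0", "10.3.1"]


def version_key(text):
    """Turn 'a.b.c' into a tuple of ints so parts compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Plain string sorting compares character by character.
as_strings = sorted(versions)

# Sorting with a key compares the numeric parts instead.
as_versions = sorted(versions, key=version_key)

# reverse=True works the same way as in order_by_version above.
newest_first = sorted(versions, key=version_key, reverse=True)
```

Here as_strings places "10.3.1" and "15.20.142" before "2.0.0", while as_versions orders them numerically.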

How to initialise a list that contains custom functions without Python running those functions during initialisation?

Short version:
How do you store functions in a list and only have them be executed when they are called using their index position in the list?
Long Version:
So I am writing a program that rolls a user-chosen number of six-sided dice, stores the results in a list and then organizes the results/ data in a dictionary.
After the data is gathered the program gives the user options from 0-2 to choose from and asks the user to type a number corresponding to the option they want.
After this input by the user, a variable, let's say TT, is assigned to it. I want the program to use TT to identify which function to run from a list called "Executable_options", by using TT as the index position of the function within the list.
The problem I am having is that the list containing the functions has to come after the functions have been defined, and when I initialise the list it goes through and executes all the functions within it, in order, when I don't want it to. I just want them to be in the list for calling at a later time.
I tried to initialise the list without any functions in it and then append the functions individually, but every time a function is appended to the list it is also executed.
def results(): ...
def Rolling_thunder(): ...
def roll_again(): ...
The functions contain code, but it is unnecessary to show for the question at hand.
Executable_options = []
Executable_options.append(results())
Executable_options.append(Rolling_thunder())
Executable_options.append(roll_again)
options = len(Executable_options)
I am relatively new to Python so I am still getting my head around it. I have tried searching for the answer to this on existing posts, but couldn't find anything so I assume I am just using the wrong key words in my search.
Thank you very much for taking the time to read this and for the answers provided.
Edit: Code now works
The () on the end of the function name calls it - i.e. results() is a call to the results function, and what gets appended is its return value.
Simply append to the list without the call - i.e:
Executable_options.append(results)
You can then call it by doing e.g.:
Executable_options[0]()
As per your given data, the code will look like this:
def results(): ...
def Rolling_thunder(): ...
def roll_again(): ...

Executable_options = []
Executable_options.append(results)
Executable_options.append(Rolling_thunder)
Executable_options.append(roll_again)

for i in range(0, len(Executable_options)):
    Executable_options[i]()
This will work for you.
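Tying this back to the menu described in the question, here is a minimal sketch of picking one function by the number the user typed. The function bodies, the run_option helper, and the hard-coded "1" standing in for the user's input are all made up for illustration.

```python
def results():
    return "showing results"


def rolling_thunder():
    return "rolling the dice"


def roll_again():
    return "rolling again"


# Store the function objects themselves; no parentheses, so nothing runs yet.
executable_options = [results, rolling_thunder, roll_again]


def run_option(choice):
    """Call the function at the position the user typed, e.g. '0', '1' or '2'."""
    index = int(choice)
    if not 0 <= index < len(executable_options):
        raise ValueError(f"choose a number from 0 to {len(executable_options) - 1}")
    return executable_options[index]()


# In the real program the argument would come from input(); "1" is a stand-in.
outcome = run_option("1")
```

Only the selected function runs, and only at the moment run_option calls it.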

Need Help creating class hierarchy in Python

I have a hierarchy of data that I would like to build using classes instead of hard-coding it. The structure is like so:
Unit (has name, abbreviation, subsystems [5 different types of subsystems])
Subsystem (has type, block diagram (photo), ParameterModel [20 different sets of ParameterModels])
ParameterModel (30 or so parameters that will have [parameter name, value, units, and model index])
I'm not sure how to do this using classes, but what I have made kind of work so far is creating nested dictionaries.
{'Unit':{'Unit1':{'Subsystem':{'Generator':{Parameter:{'Name': param1, 'Value':1, 'Units': 'seconds'}
like this, but with 10-15 units, 5-6 subsystems and 30 or so parameters per subsystem. I know using dictionaries is not the best way to go about it, but I cannot figure out the class structure or where to start building it.
I want to be able to create, read, update and delete parameters in a tkinter GUI that I have built, as well as export/import these system parameters and do calculations on them. I can handle the calculations and the import/export, but I need to create classes that will build out this structure and be able to reference each individual unit/subsystem/parameter/value/etc.
I know that's a lot, but any advice? I've been looking into the factory and abstract factory patterns in the hope of figuring out how to create the code structure, but to no avail. I have experience with MATLAB, Visual Basic, C++ and various Arduino projects, so I know most basic programming, but this class structure is something I cannot figure out how to do in an abstract way without hard-coding each parameter with giant names like Unit1_Generator_parameterName_parameter = ____, and I really don't want to do that.
Thanks,
-A
EDIT: Here is one way I've done the implementation using a dictionary, but I would like to do this using a class that can take a list and make a bunch of empty attributes, and have those be editable/callable generically, like setParamValue(unit, subsystem, param), where I can pass the unit, the subsystem and then the parameter, such as 'Td', and then be able to change the value of the key/value pair within this hierarchy.
def create_keys(list):
    dict = {key: None for key in list}
    return dict

unit_list = ['FL','ES','NN','SF','CC','HD','ND','TH'] #unit abbreviation
sub_list = ['Gen','Gov','Exc','PSS','Rel','BlkD']
params_GENROU = ["T'do","T''do","T'qo","T''qo",'H','D','Xd','Xq',"Xd'","Xq'","X''d=X''q",'Xl','S(1.0)','S(1.2)','Ra'] #parameter names
dict = create_keys(unit_list)
for key in dict:
    dict[key] = create_keys(sub_list)
    dict[key]['Gen'] = create_keys(params_GENROU)
and inside each dict[unit]['Gen'][param_name] there should be a dict containing Value, units (seconds, degrees, etc.), description and CON (basically an index for another program we use).
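One possible way to express the same nesting with classes is sketched below using dataclasses. It is only a starting point under assumptions: the class names, field names, and the build_units and set_param_value helpers are hypothetical, and only a few of the units and parameters from the lists above are used.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Parameter:
    name: str
    value: Optional[float] = None
    units: str = ""
    description: str = ""
    con: Optional[int] = None  # index used by the other program


@dataclass
class Subsystem:
    kind: str
    parameters: Dict[str, Parameter] = field(default_factory=dict)

    def add_parameters(self, names):
        """Create an empty Parameter for every name in the list."""
        for name in names:
            self.parameters[name] = Parameter(name)


@dataclass
class Unit:
    abbreviation: str
    subsystems: Dict[str, Subsystem] = field(default_factory=dict)


def build_units(unit_names, subsystem_names):
    """Build one Unit per abbreviation, each with its own empty subsystems."""
    return {
        unit: Unit(unit, {sub: Subsystem(sub) for sub in subsystem_names})
        for unit in unit_names
    }


def set_param_value(units, unit, subsystem, param, value):
    """Update one parameter value by walking unit -> subsystem -> parameter."""
    units[unit].subsystems[subsystem].parameters[param].value = value


units = build_units(['FL', 'ES'], ['Gen', 'Gov'])
for unit in units.values():
    unit.subsystems['Gen'].add_parameters(["T'do", 'H', 'Xd'])
set_param_value(units, 'FL', 'Gen', 'H', 3.5)
```

Each level keeps its children in a dictionary keyed by name, so lookups stay as direct as in the nested-dictionary version while the fields of each parameter become named attributes.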
