How can I use Brightway2 with US LCI database? - brightway

Short version:
I am trying to upload US LCI database to Brightway2 and I am failing miserably. Has anyone succeeded? If so, could you share it with me? :D
Long version:
I am following the notebook IO - Importing the US LCI database notebook and I am having a lot of problems. I am aware that, as the notebook indicates, it is a work in progress. Anyhow, I wanted to give it a try:
I tried uploading every ecospold version database found here, following the method from the notebook. The only one that gave me a similar results was version FY20.Q3.02. However, right off the bat I get the following differences/errors:
Same as the notebook, I get this error: Couldn't apply strategy link_technosphere_by_activity_hash: Object in source database can't be uniquely linked to target database. And two activities that are linked. When I follow the instructions of ignoring these datasets, it throws me that error over and over again.
Trying to move on with the tutorial, I get more errors and at the end I end up with all exchanges unlinked:
633 datasets
37513 exchanges
37505 unlinked exchanges
Finally, after running the code in line [15]:
import functools
f = functools.partial(link_iterable_by_fields,
other=Database(config.biosphere),
kind='biosphere'
)
sp.apply_strategy(f)
sp.statistics(f)
I end up with:
0 datasets
0 exchanges
0 unlinked exchanges
Which is hilarious and sad at the same time. Since I am new with Python and BW, my troubleshooting is clumpsy and probably erroneous (I promise I googled a lot and went through the code). And concluded I am failing and it is time to ask questions:
Has anybody succeeded uploading the US LCI database to Brightway2?
If so, how? Which file did you use?
Thank you!!!!

This is an excellent question. I have added text to the offending notebook to note that it is obsolete.
In general, I think trying to import the ecospold files is a fools errand, as though they are labeled ecospold2, they are actually ecospold1 (which is a totally different format):
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ecoSpold xmlns="http://www.EcoInvent.org/EcoSpold01">
The most recent export also raises an error when I try the ecospold1 importer:
AttributeError: no such child: {http://www.EcoInvent.org/EcoSpold01}modellingAndValidation
This is a required attribute in ecospold1.
I think the best way forward would be to consume the JSON-LD directly. Note that it is important not to run bw2setup(), as you would also want to use their list of elementary flows and LCIA methods. Currently the experimental JSON-LD importer fails because the provided datasets need allocation, but don't provide a set of consistent allocation methods. When I use the git checkout of bw2io and do the following:
uslci = JSONLDImporter(
"/Users/cmutel/Downloads/National_Renewable_Energy_Laboratory-USLCI_Database/",
"US LCI",
preferred_allocation="CAUSAL_ALLOCATION"
)
uslci.apply_strategies()
I get the following error:
UnallocatableDataset: We currently only support exchange-specific CAUSAL_ALLOCATION
This is fixable, but someone would need to step through this and fix the allocation procedure, and I don't have the time to do that now.

Related

Setting up Visual Studio Code to run models from Hugging Face

I am trying to import models from hugging face and use them in Visual Studio Code.
I installed transformers, tensorflow, and torch.
I have tried looking at multiple tutorials online but have found nothing.
I am trying to run the following code:
from transformers import pipeline
classifier = pipeline('sentiment-analysis')
result = classifier("I hate it when I'm sitting under a tree and an apple hits my head.")
print(result)
However, I get the following error:
No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Traceback (most recent call last):
File "c:\Users\user\Desktop\Artificial Intelligence\transformers\Workshops\workshop_3.py", line 4, in <module>
classifier = pipeline('sentiment-analysis')
File "C:\Users\user\Desktop\Artificial Intelligence\transformers\src\transformers\pipelines\__init__.py", line 702, in pipeline
framework, model = infer_framework_load_model(
File "C:\Users\user\Desktop\Artificial Intelligence\transformers\src\transformers\pipelines\base.py", line 266, in infer_framework_load_model
raise ValueError(f"Could not load model {model} with any of the following classes: {class_tuple}.")
ValueError: Could not load model distilbert-base-uncased-finetuned-sst-2-english with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForSequenceClassification'>, <class 'transformers.models.auto.modeling_tf_auto.TFAutoModelForSequenceClassification'>, <class 'transformers.models.distilbert.modeling_distilbert.DistilBertForSequenceClassification'>, <class 'transformers.models.distilbert.modeling_tf_distilbert.TFDistilBertForSequenceClassification'>).
I have already searched online for ways to set up transformers to use in Visual Studio Code but nothing is helping.
Do you know how to fix this error, or if someone knows how to successfully use models from Hugging Face into my code, it would be appreciated?
This question is a little less about Hugging Face itself and likely more about installation and the installation steps you took (and potentially your program's access to the cache file where the models are automatically downloaded to.).
From what I am seeing either:
1/ your program is unable to access the model
2/ your program is throwing specific value errors in a bit of an edge case
If 1/ Take a look here: [https://huggingface.co/docs/transformers/installation#cache-setup][1]
Notice that it the docs walks through where the pre-trained models are downloaded. Check that it was downloaded here: C:\Users\username\.cache\huggingface\hub (of course with your own username on your computer instead. Check in the cache location to make sure it was downloaded? (You can check in the cache locations mentioned.)
Second, if for some reason, there is an issue with downloading, you can try downloading manually and doing it via offline mode (this is more to get it up and running): https://huggingface.co/docs/transformers/installation#offline-mode
Third, if it is downloaded, do you have the right permissions to access the .cache? (Try running your program (if it is a program that you trust) on Windows Terminal as an administrator.). Various ways - find one that you're comfortable with, here are a couple hints from Stackoverflow/StackExchange: Opening up Windows Terminal with elevated privileges, from within Windows Terminal or this: https://superuser.com/questions/1560049/open-windows-terminal-as-admin-with-winr
If 2/ I have seen people bring up very specific issues on not finding specific values (not the same as yours but similar) and the issue was solved by installing PyTorch because some models only exist as PyTorch models. You can see the full response from #YokoHono here: Transformers model from Hugging-Face throws error that specific classes couldn t be loaded

Invalid date:Error while import CSV to Cassandra using pySpark

I'm using Jupyter NoteBook to run pySpark code to import CSV file to Cassandra v3.11.3. Getting below error.
... 1 more[![enter image description here][1]][1]
---------------------------------------------------------------------------
pySpark Code i have attached as picture:
[![pyspark_code][1]][1]
Any inputs...
Without the full trace it's hard to know exactly where this is failing. The method you pasted is just the p4yj wrapper method and we really would need to see the underlying Java Exception.
From what I can tell it looks like you are attempting to also use some options on the C* write that are unsupported. For example "MODE" - "DROPMALFORMED" is not a valid C* connector option. DataFrame Writer and Reader options are source specific so you are unfortunately unable to mix and match.
This makes me think that the data being written actually has a malformed date string or two and this code is dying when attempting to write the broken record. One way around this would be to attempt to do the date casting on CSV read which I believe does support DROPMALFORMED style parsing options.

Google Colab - downloads some files, TypeError: Failed to fetch on others

I have a Google Colab notebook with PyTorch code running in it.
At the beginning of the train function, I create, save and download word_to_ix and tag_to_ix dictionaries without a problem, using the following code:
from google.colab import files
torch.save(tag_to_ix, pos_dict_path)
files.download(pos_dict_path)
torch.save(word_to_ix, word_dict_path)
files.download(word_dict_path)
I train the model, and then try to download it with the code:
torch.save(model.state_dict(), model_path)
files.download(model_path)
Then I get a MessageError: TypeError: Failed to fetch.
Obviously, the problem is not with the third party cookies (as suggested here), because the first files are downloaded without a problem. (I actually also tried adding the link in my Allow section, but, surprise surprise, it made no difference.)
I was originally trying to save the model as is (which, to my understanding, saves it as a Pickle), and I thought maybe Colab files doesn't handle downloading Pickles well, but as you can see above, I'm now trying to save a dict object (which is also what word_to_ix and tag_to_ix) are, and it's still not working.
Downloading the file manually with right-click isn't a solution, because sometimes I leave the code running while I do other things, and by the time I get back to it, the runtime has disconnected, and the files are gone.
Any suggestions?

error uploading csv file on cloud jupyter notebook

I have set up a google cloud account
I want to perform my deep learning much more faster on a jupyter notebook, but
I cannot find a way to read my csv file
I downloaded it with wget from my github account and afterwards I tried
dataset = pd.read_csv('/home/user/.jupyter/SIEMENSTRAIN.csv')
but I get the following error
pandas.parser.CParserError: Error tokenizing data. C error: Expected 2 fields in line 3, saw 12
Why? When I read it on my laptop using my jupyter notebooks, everything runs well
Any suggestions?
I tried the recommended solutions for this error and I got the next warning
/home/user/anaconda3/lib/python3.5/site-packages/ipykernel/main.py:1: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators; you can avoid this warning by specifying engine='python'.
if name == 'main':
When I ran dataset.head() this is what appeared
Any help please?
There are a number of possibilities that could be causing the problem... I would first always make sure that Pandas (pd)'s version is updated and compatible.
The more likely cause is that the CSV itself is not right, so pd.read_csv() is not able to work correctly (thus a Parse Error). This may have something to do with the headers, though I'm not sure what your original CSV file looks like. It's worth playing around with read_csv, for example:
df = pandas.read_csv(fileName, sep='delimiter', header=None)
This tampers with 2 things - the delimiter, and if pd is reading a header from CSV or not.
I go through some pd.read_csv() stuff in my book about Stock Prediction (another cool Machine Learning problem) and Deep Learning, feel free to check it out.
Good Luck!
I tried what you proposed and this is what I got
So, any suggestions?
I suppose that the path is ok, but it just won't be read properly, or am I wrong?

Import Ecoinvent 2.2 Ecospold files into Brightway

I was trying to use the following code (Figure.1) to import Ecoinvent v2.2 into Brightway. I followed code from: https://github.com/PoutineAndRosti/Brightway-Seminar-2017/blob/master/Day%201%20AM/2%20-%20BW%20structure%20and%20first%20LCAs.ipynb
I obtained all XML (ecospold files) downloaded from Simapro (which is connected to ecoinvent database) and save all datafiles into the folder: C:\bw2-python\ecoSpold1.
However, when I run the next step, I ran into the following errors:
Figure 2
I am not sure what is wrong here. Any suggestion would be very helpful!
I think the ecospold files obtained from simapro and from the ecoinvent website are not the same. Simapro codes things a bit differently, which I think affects the naming of exchanges (that is why you got an invalid exchange). You either download the ecospold files from ecoinvent or use the tools see notebooks here and here to read exported csv files (the format prefered by simapro to export datasets).

Resources