How can I know if pickle.dump() successfully saved the file - python-3.x

How can I know if pickle.dump() successfully saved the pickle file?
In the docs, I do not see a return value that indicates success or failure.
I'm working with Python 3, currently in a Jupyter notebook.

If pickle.dump or pickle.dumps fails, an exception is raised. See further down in the docs for what can and can't be pickled. You may also get an OSError if some lower-level system call (such as the actual write to disk) fails.
Note, however, that even if pickle.dump does not raise an error, you may still be unable to load the pickled data later. For instance, the pickled object may reference an import or a function that was only defined in the context of the pickling code: say a Jupyter notebook defines a custom function which is referenced by the pickled object. If you ship that pickle file to another machine, it won't see the function referenced by the object and the unpickling will fail.
Similarly, if there's an API change in a module that the pickled object depends on, the import paths may have changed and the unpickling will again fail.
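For instance, a minimal sketch of catching a failed dump at write time (the file name and object below are placeholders, not from the question):

import pickle

data = {"weights": [0.1, 0.2], "label": "example"}  # placeholder object

try:
    with open("model.pkl", "wb") as f:
        pickle.dump(data, f)
except pickle.PicklingError as err:
    # raised when the object (or something it references) can't be pickled;
    # note that some unpicklable objects raise TypeError instead
    print(f"Could not pickle the object: {err}")
except OSError as err:
    # raised when the underlying write fails (bad path, disk full, ...)
    print(f"Could not write the file: {err}")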
You may also want to have a look at dill, which covers slightly more cases than pickle: https://github.com/uqfoundation/dill
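For example, dill can serialize some objects that pickle rejects, such as a lambda (a small illustrative sketch, not from the original answer):

import dill

payload = dill.dumps(lambda x: x + 1)  # pickle.dumps would raise PicklingError here
restored = dill.loads(payload)
print(restored(2))  # 3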

Related

from sparkdl import DeepImageFeaturizer

I need to use Spark for transfer learning to train on images. The error is:
"cannot import name 'resnet50' from 'keras.applications' (/usr/local/lib/python3.7/dist-packages/keras/applications/__init__.py)"
I have been trying to solve this for a week. It comes from sparkdl: if you add from tensorflow.keras.applications to this file (sparkdl/transformers/keras_applications.py), that import error goes away, but this time you will see another error like
AttributeError: module 'tensorflow' has no attribute 'Session'
I tried different IDEs (PyCharm, VS Code) but I got the same errors. There are different explanations on Stack Overflow, but I'm totally confused now.
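For reference, a hedged sketch of the two pieces involved: the import path change described above, and the TensorFlow 2.x compatibility alias for the removed tf.Session (whether patching sparkdl this way is sufficient is an assumption, not something confirmed here):

# In sparkdl/transformers/keras_applications.py (as described above), an
# import of this form resolves in TF 2.x, unlike keras.applications:
from tensorflow.keras.applications import resnet50

# tf.Session was removed in TensorFlow 2.x; the old API lives on under compat.v1
import tensorflow as tf

tf.compat.v1.disable_eager_execution()
sess = tf.compat.v1.Session()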

How can I use Brightway2 with US LCI database?

Short version:
I am trying to upload the US LCI database to Brightway2 and I am failing miserably. Has anyone succeeded? If so, could you share it with me? :D
Long version:
I am following the IO - Importing the US LCI database notebook and I am having a lot of problems. I am aware that, as the notebook indicates, it is a work in progress. Anyhow, I wanted to give it a try:
I tried uploading every ecospold version of the database found here, following the method from the notebook. The only one that gave me similar results was version FY20.Q3.02. However, right off the bat I get the following differences/errors:
Same as in the notebook, I get this error: Couldn't apply strategy link_technosphere_by_activity_hash: Object in source database can't be uniquely linked to target database, and two activities that are linked. When I follow the instructions to ignore these datasets, it throws that error over and over again.
Trying to move on with the tutorial, I get more errors and at the end I end up with all exchanges unlinked:
633 datasets
37513 exchanges
37505 unlinked exchanges
Finally, after running the code in line [15]:
import functools
f = functools.partial(link_iterable_by_fields,
                      other=Database(config.biosphere),
                      kind='biosphere'
                      )
sp.apply_strategy(f)
sp.statistics(f)
I end up with:
0 datasets
0 exchanges
0 unlinked exchanges
Which is hilarious and sad at the same time. Since I am new to Python and BW, my troubleshooting is clumsy and probably erroneous (I promise I googled a lot and went through the code). I concluded I am failing and it is time to ask questions:
Has anybody succeeded uploading the US LCI database to Brightway2?
If so, how? Which file did you use?
Thank you!!!!
This is an excellent question. I have added text to the offending notebook to note that it is obsolete.
In general, I think trying to import the ecospold files is a fool's errand: although they are labelled ecospold2, they are actually ecospold1 (which is a totally different format):
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ecoSpold xmlns="http://www.EcoInvent.org/EcoSpold01">
The most recent export also raises an error when I try the ecospold1 importer:
AttributeError: no such child: {http://www.EcoInvent.org/EcoSpold01}modellingAndValidation
This is a required attribute in ecospold1.
I think the best way forward would be to consume the JSON-LD directly. Note that it is important not to run bw2setup(), as you would also want to use their list of elementary flows and LCIA methods. Currently the experimental JSON-LD importer fails because the provided datasets need allocation, but don't provide a set of consistent allocation methods. When I use the git checkout of bw2io and do the following:
uslci = JSONLDImporter(
    "/Users/cmutel/Downloads/National_Renewable_Energy_Laboratory-USLCI_Database/",
    "US LCI",
    preferred_allocation="CAUSAL_ALLOCATION"
)
uslci.apply_strategies()
uslci.apply_strategies()
I get the following error:
UnallocatableDataset: We currently only support exchange-specific CAUSAL_ALLOCATION
This is fixable, but someone would need to step through this and fix the allocation procedure, and I don't have the time to do that now.

NameError: name 'log10' is not defined in function called in script

Why is log10() failing to be recognized when called within a function defined in another script? I'm running Python 3 in Anaconda (Jupyter and Spyder).
I've had success with log10() in Jupyter (oddly without even calling "import math"). I've had success with defining functions in a .py file and calling those functions within a separate script. I should be able to perform a simple log10.
I created a new function (in Spyder) and saved it in a file "test_log10.py":
def test_log10(input):
    import math
    return math.log10(input)
In a separate script (a Jupyter notebook) I run:
import test_log10
test_log10.test_log10(10)
I get the following error:
"NameError: name 'log10' is not defined"
What am I missing?
Since I'm not using Jupyter or similar environments, I don't know how to correct it within those systems; perhaps there is some configuration file there, so check the documentation.
But on the issue itself: when this happens it is because Python has not "linked" something properly at import time, so I suggest a workaround using the libraries as follows:
import numpy as np
import math
and when you use functions from math, simply add the np. prefix, i.e. change
return math.log10(input)
to
return np.math.log10(input)
I don't know exactly why the mismatch occurs, but this worked for me.
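Putting the suggested workaround together, a minimal sketch of the modified test_log10.py (note that np.math simply forwards to the standard library math module in older NumPy versions; it was removed in NumPy 2.0, so this only illustrates the suggested change, not the underlying cause):

# test_log10.py
import numpy as np

def test_log10(input):
    # np.math is an alias for the standard library math module
    # in older NumPy versions (removed in NumPy 2.0)
    return np.math.log10(input)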

Google Colab - downloads some files, TypeError: Failed to fetch on others

I have a Google Colab notebook with PyTorch code running in it.
At the beginning of the train function, I create, save and download word_to_ix and tag_to_ix dictionaries without a problem, using the following code:
from google.colab import files
torch.save(tag_to_ix, pos_dict_path)
files.download(pos_dict_path)
torch.save(word_to_ix, word_dict_path)
files.download(word_dict_path)
I train the model, and then try to download it with the code:
torch.save(model.state_dict(), model_path)
files.download(model_path)
Then I get a MessageError: TypeError: Failed to fetch.
Obviously, the problem is not with the third party cookies (as suggested here), because the first files are downloaded without a problem. (I actually also tried adding the link in my Allow section, but, surprise surprise, it made no difference.)
I was originally trying to save the model as is (which, to my understanding, saves it as a pickle), and I thought maybe Colab's files module doesn't handle downloading pickles well, but as you can see above, I'm now trying to save a dict object (which is also what word_to_ix and tag_to_ix are), and it's still not working.
Downloading the file manually with right-click isn't a solution, because sometimes I leave the code running while I do other things, and by the time I get back to it, the runtime has disconnected, and the files are gone.
Any suggestions?
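A sketch of one common way to keep the file around if the runtime disconnects is to copy it to a mounted Google Drive; this is a general pattern rather than a fix for the Failed to fetch error itself, and the destination path is a placeholder:

from google.colab import drive
import shutil

# Mount Google Drive (prompts for authorization once per runtime)
drive.mount('/content/drive')

# Copy the saved model into Drive so it persists after the runtime disconnects
shutil.copy(model_path, '/content/drive/My Drive/model_checkpoint.pt')  # placeholder destination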

Error handling netCDF file in Python

I am extracting data from netCDF files with Python code. I need to check whether the netCDF files are in agreement with the CORDEX standards (CORDEX is a coordinated effort to carry out modelling experiments with regional climate models). For this I need to access an attribute of the netCDF file. If the attribute is not found, then the code should go on to the next file.
A snippet of my code is as follows:
import netCDF4

cdf_dataset = netCDF4.Dataset(file_2read)
try:
    cdf_domain = cdf_dataset.CORDEX_domain
    print(cdf_domain)
except:
    print('No CORDEX domain found. Will exit')
....some more code....
When the attribute "CORDEX_domain" is available, everything is fine. If the attribute is not available, then the following exception is raised.
AttributeError: NetCDF: Attribute not found
This is a third-party exception, which I would say should be handled as a general one, but it is not: I am not able to get my "print" inside the "except" block to run, or anything else for that matter. Can anyone point me to the way to handle this? Thanks.
There is no need for a try/except block; netCDF4.Dataset has a method ncattrs which returns all global attributes, so you can test whether the required attribute is there. For example:
if 'CORDEX_domain' in cdf_dataset.ncattrs():
    do_something()
You can do the same to test if (for example) a required variable is present:
if 'some_var_name' in cdf_dataset.variables:
    do_something_else()
P.S.: "catch-alls" are usually a bad idea, e.g. Python: about catching ANY exception
EDIT:
You can do the same for variable attributes, e.g.:
var = cdf_dataset.variables['some_var_name']
if 'some_attribute' in var.ncattrs():
    do_something_completely_else()
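Putting the answer's suggestion into the shape the question asks for (skip to the next file when the attribute is missing); the file list here is a placeholder:

import netCDF4

files_to_read = ['file_a.nc', 'file_b.nc']  # placeholder file names

for file_2read in files_to_read:
    cdf_dataset = netCDF4.Dataset(file_2read)
    # ncattrs() lists the global attributes, so a membership test
    # replaces the try/except around the attribute access
    if 'CORDEX_domain' not in cdf_dataset.ncattrs():
        print(f'{file_2read}: no CORDEX domain found, skipping')
        cdf_dataset.close()
        continue
    cdf_domain = cdf_dataset.CORDEX_domain
    print(cdf_domain)
    # ...process the CORDEX-compliant file...
    cdf_dataset.close()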
