ValueError: negative dimensions are not allowed when loading .pkl file - python-3.x

Although there are many question threads for error ValueError: negative dimensions are not allowed
I couldn't find the answer for my problem
After training Machine learning model using SGDclassifer
clf=linear_model.SGDClassifier(loss='log',random_state=20000,verbose=1,class_weight='balanced')
model=clf.fit(X,Y)
Dimension of X is (1651880,246177)
The below code is working i.e when saving model object and when using model for prediction
joblib.dump(model, 'trainedmodel.pkl',compress=3)
prediction_result=model.predict(x_test)
but getting error when loading the saved model
model = joblib.load('trainedmodel.pkl')
below is the error message
Please help me out to resolve it.
File "C:\Users\Taxonomy\AppData\Roaming\Python\Python36\site-packages\sklearn\externals\joblib\numpy_pickle.py", line 598, in load
obj = _unpickle(fobj, filename, mmap_mode)
File "C:\Users\Taxonomy\AppData\Roaming\Python\Python36\site-packages\sklearn\externals\joblib\numpy_pickle.py", line 526, in _unpickle
obj = unpickler.load()
File "C:\Users\Taxonomy\Anaconda3\lib\pickle.py", line 1050, in load
dispatch[key[0]](self)
File "C:\Users\Taxonomy\AppData\Roaming\Python\Python36\site-packages\sklearn\externals\joblib\numpy_pickle.py", line 352, in load_build
self.stack.append(array_wrapper.read(self))
File "C:\Users\Taxonomy\AppData\Roaming\Python\Python36\site-packages\sklearn\externals\joblib\numpy_pickle.py", line 195, in read
array = self.read_array(unpickler)
File "C:\Users\Taxonomy\AppData\Roaming\Python\Python36\site-packages\sklearn\externals\joblib\numpy_pickle.py", line 141, in read_array
array = unpickler.np.empty(count, dtype=self.dtype)
ValueError: negative dimensions are not allowed

Try to dump model with protocol 4.
from python's pickle docs:
Protocol version 4 was added in Python 3.4. It adds support for very
large objects, pickling more kinds of objects, and some data format
optimizations. Refer to PEP 3154 for information about improvements
brought by protocol 4.

Related

TensorFlow 2.1 using TPUEstimator: RuntimeError: All tensors outfed from TPU should preserve batch size dimension, but got scalar Tensor

I just converted an existing project from TF 1.14 to TF 2.1 which uses the TPUEstimator API. After making the conversion, testing locally (i.e. use_tpu=False) runs successfully. However, I am getting errors when running on Google Cloud TPU (i.e. use_tpu=True).
Note: This is in the context of the AdaNet AutoML framework (v0.8.0), although I suspect this may be a general TPUEstimator-related error, as the errors appear to originate in the tpu_estimator.py and error_handling.py scripts seen in the Traceback below:
File "/usr/local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3032, in train
rendezvous.record_error('training_loop', sys.exc_info())
File "/usr/local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 81, in record_error
if value and value.op and value.op.type == _CHECK_NUMERIC_OP_NAME:
AttributeError: 'RuntimeError' object has no attribute 'op'
During handling of the above exception, another exception occurred:
File "workspace/trainer/train.py", line 331, in <module>
main(args=parsed_args)
File "workspace/trainer/train.py", line 177, in main
run_config=run_config)
File "workspace/trainer/train.py", line 68, in run_experiment
estimator.train(input_fn=train_input_fn, max_steps=total_train_steps)
File "/usr/local/lib/python3.6/site-packages/adanet/core/estimator.py", line 853, in train
saving_listeners=saving_listeners)
File "/usr/local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3035, in train
rendezvous.raise_errors()
File "/usr/local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 143, in raise_errors
six.reraise(typ, value, traceback)
File "/usr/local/lib/python3.6/site-packages/six.py", line 703, in reraise
raise value
File "/usr/local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
saving_listeners=saving_listeners)
File "/usr/local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 374, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1164, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1194, in _train_model_default
features, labels, ModeKeys.TRAIN, self.config)
File "/usr/local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2857, in _call_model_fn
config)
File "/usr/local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1152, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/usr/local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3186, in _model_fn
host_ops = host_call.create_tpu_hostcall()
File "/usr/local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2226, in create_tpu_hostcall
'dimension, but got scalar {}'.format(dequeue_ops[i][0]))
RuntimeError: All tensors outfed from TPU should preserve batch size dimension, but got scalar Tensor("OutfeedDequeueTuple:1", shape=(), dtype=int64, device=/job:tpu_worker/task:0/device:CPU:0)'
The previous version of the project using TF 1.14 runs both locally and on TPU using TPUEstimator without issues. Is there something obvious I am potentially missing for the conversion over to TF 2.1 when using TPUEstimator API?
Have you applied the following:
dataset = ...
dataset = dataset.apply(tf.contrib.data.batch_and_drop_remainder(batch_size))
this potentially drops the last few samples from a file to ensure that every batch has a static shape of batch_size, which is required when training on TPUs.

Can I use dictionary in keras customized model?

I recently read a paper about UNet++,and I want to implement this structure with tensorflow-2.0 and keras customized model. As the structure is so complicated, I decided to manage the keras layers by a dictionary. Everything went well in training, but an error occurred while saving the model. Here is a minimum code to show the error:
class DicModel(tf.keras.Model):
def __init__(self):
super(DicModel, self).__init__(name='SequenceEECNN')
self.c = {}
self.c[0] = tf.keras.Sequential([
tf.keras.layers.Conv2D(32, 3,activation='relu',padding='same'),
tf.keras.layers.BatchNormalization()]
)
self.c[1] = tf.keras.layers.Conv2D(3,3,activation='softmax',padding='same')
def call(self,images):
x = self.c[0](images)
x = self.c[1](x)
return x
X_train,y_train = load_data()
X_test,y_test = load_data()
class_weight.compute_class_weight('balanced',np.ravel(np.unique(y_train)),np.ravel(y_train))
model = DicModel()
model_name = 'test'
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir='logs/'+model_name+'/')
early_stop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss',patience=100,mode='min')
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
loss=tf.keras.losses.sparse_categorical_crossentropy,
metrics=['accuracy'])
results = model.fit(X_train,y_train,batch_size=4,epochs=5,validation_data=(X_test,y_test),
callbacks=[tensorboard_callback,early_stop_callback],
class_weight=[0.2,2.0,100.0])
model.save_weights('model/'+model_name,save_format='tf')
The error information is:
Traceback (most recent call last):
File "/media/xrzhang/Data/ZHS/Research/CNN-TF2/learn_tf2/test_model.py", line 61, in \<module>
model.save_weights('model/'+model_name,save_format='tf')
File "/media/xrzhang/Data/ZHS/Research/CNN-TF2/venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/network.py", line 1328, in save_weights
self.\_trackable_saver.save(filepath, session=session)
File "/media/xrzhang/Data/ZHS/Research/CNN-TF2/venv/lib/python3.6/site-packages/tensorflow/python/training/tracking/util.py", line 1106, in save
file_prefix=file_prefix_tensor, object_graph_tensor=object_graph_tensor)
File "/media/xrzhang/Data/ZHS/Research/CNN-TF2/venv/lib/python3.6/site-packages/tensorflow/python/training/tracking/util.py", line 1046, in \_save_cached_when_graph_building
object_graph_tensor=object_graph_tensor)
File "/media/xrzhang/Data/ZHS/Research/CNN-TF2/venv/lib/python3.6/site-packages/tensorflow/python/training/tracking/util.py", line 1014, in \_gather_saveables
feed_additions) = self.\_graph_view.serialize_object_graph()
File "/media/xrzhang/Data/ZHS/Research/CNN-TF2/venv/lib/python3.6/site-packages/tensorflow/python/training/tracking/graph_view.py", line 379, in serialize_object_graph
trackable_objects, path_to_root = self.\_breadth_first_traversal()
File "/media/xrzhang/Data/ZHS/Research/CNN-TF2/venv/lib/python3.6/site-packages/tensorflow/python/training/tracking/graph_view.py", line 199, in \_breadth_first_traversal
for name, dependency in self.list_dependencies(current_trackable):
File "/media/xrzhang/Data/ZHS/Research/CNN-TF2/venv/lib/python3.6/site-packages/tensorflow/python/training/tracking/graph_view.py", line 159, in list_dependencies
return obj.\_checkpoint_dependencies
File "/media/xrzhang/Data/ZHS/Research/CNN-TF2/venv/lib/python3.6/site-packages/tensorflow/python/training/tracking/data_structures.py", line 690, in \_\_getattribute\_\_
return object.\_\_getattribute\_\_(self, name)
File "/media/xrzhang/Data/ZHS/Research/CNN-TF2/venv/lib/python3.6/site-packages/tensorflow/python/training/tracking/data_structures.py", line 732, in \_checkpoint_dependencies
"ignored." % (self,))
ValueError: Unable to save the object {0: \<tensorflow.python.keras.engine.sequential.Sequential object at 0x7fb5c6c36588>, 1: \<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7fb5c6c36630>} (a dictionary wrapper constructed automatically on attribute assignment). The wrapped dictionary contains a non-string key which maps to a trackable object or mutable data structure.
If you don't need this dictionary checkpointed, wrap it in a tf.contrib.checkpoint.NoDependency object; it will be automatically un-wrapped and subsequently ignored.
The tf.contrib.checkpoint.NoDependency seems has been removed from Tensorflow-2.0 (https://medium.com/tensorflow/whats-coming-in-tensorflow-2-0-d3663832e9b8). How can I fix this issue? Or should I just give up using dictionary in customized Keras Model. Thank you for your time and helps!
Use string keys. For some reason tensorflow doesn't like int keys.
The exception message was incorrect in Tensorflow 2.0 and has been fixed in 2.2
You can avoid the problem by wrapping the c attribute like this
from tensorflow.python.training.tracking.data_structures import NoDependency
self.c = NoDependency({})
For more details check this issue.

I am getting error "Restoring from checkpoint failed." while training tensorflow estimator api on AI-platform(ml-engine)

I am trying to do hyperparameter tuning on ai-engine for DNN regressor using tensorflow estimator api. But after submitting the job, it shows that job is failed and I get this error in job details.
Can someone help?
Hyperparameter Tuning Trial #1 Failed before any other successful trials were completed. The failed trial had parameters: learning_rate=0.0019937718716419557, num-layers=2, first-layer-size=148, scale-factor=0.7910729020312588, . The trial's error message was: The replica master 0 exited with a non-zero status of 1.
Traceback (most recent call last):
[...]
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 507, in _build_internal
restore_sequentially, reshape)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 385, in _AddShardedRestoreOps
name="restore_shard"))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 332, in _AddRestoreOps
restore_sequentially)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 580, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 1572, in restore_v2
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
tensor_name = dnn/hiddenlayer_0/bias; shape in shape_and_slice spec [148] does not match the shape stored in checkpoint: [117]
[[node save/RestoreV2_1 (defined at /usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py:1403) ]]
Looks like you are using the same output directory for all the trials, and so trial#1 is trying to read trial#2 checkpoint (perhaps because it is the latest one in the directory) and failing because the architecture is different
Make sure to use a different output directory for each hyperparam training run. There are two ways you do this:
Use the --job-dir as the output directory.
Append a hyperparam trial number to the output directory you are using now:
output_dir = os.path.join( output_dir, json.loads( os.environ.get('TF_CONFIG', '{}') ).get('task', {}).get('trial', '') )

How do I get word2vec to load a string? problem:'dict' object has no attribute '_load_specials'

I have a problem when using word2vec and lstm, the code is:
def input_transform(string):
words=jieba.lcut(string)
words=np.array(words).reshape(1,-1)
model=Word2Vec.load('lstm_datamodel.pkl')
combined=create_dictionaries(model,words)
return combined
def lstm_predict(string):
print ('loading model......')
with open('lstm_data.yml', 'r') as f:
yaml_string = yaml.load(f)
model = model_from_yaml(yaml_string)
print ('loading weights......')
model.load_weights('lstm_data.h5')
model.compile(loss='binary_crossentropy',
optimizer='adam',metrics=['accuracy'])
data=input_transform(string)
data.reshape(1,-1)
#print data
result=model.predict_classes(data)
if result[0][0]==1:
print (string,' positive')
else:
print (string,' negative')
and the error is:
Traceback (most recent call last):
File "C:\Python36\lib\site-packages\gensim\models\word2vec.py", line 1312, in load
model = super(Word2Vec, cls).load(*args, **kwargs)
File "C:\Python36\lib\site-packages\gensim\models\base_any2vec.py", line 1244, in load
model = super(BaseWordEmbeddingsModel, cls).load(*args, **kwargs)
File "C:\Python36\lib\site-packages\gensim\models\base_any2vec.py", line 603, in load
return super(BaseAny2VecModel, cls).load(fname_or_handle, **kwargs)
File "C:\Python36\lib\site-packages\gensim\utils.py", line 423, in load
obj._load_specials(fname, mmap, compress, subname)
AttributeError: 'dict' object has no attribute '_load_specials'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/GitHub/reviewsentiment/veclstm.py", line 211, in <module>
lstm_predict(string)
File "C:/GitHub/reviewsentiment/veclstm.py", line 191, in lstm_predict
data=input_transform(string)
File "C:/GitHub/reviewsentiment/veclstm.py", line 177, in input_transform
model=Word2Vec.load('lstm_datamodel.pkl')
File "C:\Python36\lib\site-packages\gensim\models\word2vec.py", line 1323, in load
return load_old_word2vec(*args, **kwargs)
File "C:\Python36\lib\site-packages\gensim\models\deprecated\word2vec.py", line 153, in load_old_word2vec
old_model = Word2Vec.load(*args, **kwargs)
File "C:\Python36\lib\site-packages\gensim\models\deprecated\word2vec.py", line 1618, in load
model = super(Word2Vec, cls).load(*args, **kwargs)
File "C:\Python36\lib\site-packages\gensim\models\deprecated\old_saveload.py", line 88, in load
obj._load_specials(fname, mmap, compress, subname)
AttributeError: 'dict' object has no attribute '_load_specials'enter code here
I am sorry for including so much code.
This is my first time to ask on StackOverflow, and I have tried my very best to find the answer on my own, but failed. So can you help me? Thank you very much!
The error is occurring on the line...
model=Word2Vec.load('lstm_datamodel.pkl')
...so all the other/later code you've supplied is irrelevant and superfluous.
The suffix of your filename, lstm_datamodel.pkl, suggests it may have been created via Python's pickle() facility. The gensim Word2Vec.load() method only expects to load models that were saved by the module's own save() routine, not any pickled object.
The gensim native save() does make use of pickle for some of its saving, but not all, and thus wouldn't expect a fully-pickled object in the file provided.
This might be cause of your problem. You could try instead a load based entirely on Python pickle:
model = pickle.load('lstm_datamodel.pkl')
Alternatively, if you can reconstruct the model in the file, but be sure to save it via the native gensim model.save(filename), that might also resolve the problem.

TensorFlow reshaping with Conv1D

I have seen problems similar to mine here on Stack Overflow, but not exactly the same. I can reshape when using fully-connected NN layers, but not with Conv1D layers. Here's a minimal example. I'm using TF 1.4.0 on Python 3.6.3.
import tensorflow as tf
# fully connected
fc = tf.placeholder(tf.float32, [None,12])
fc = tf.contrib.layers.fully_connected(fc, 12)
fc = tf.contrib.layers.fully_connected(fc, 6)
fc = tf.reshape(fc, [-1,3,2])
# convolutional
con = tf.placeholder(tf.float32, [None,50,4])
con = tf.layers.Conv1D(con, 12, 3, activation=tf.nn.relu)
con = tf.layers.Conv1D(con, 6, 3, activation=tf.nn.relu)
con = tf.reshape(con, [-1,50,3,2])
Here is the output (yes, I'm aware of the RuntimeWarning. The messages I have found which discuss it suggest that it's harmless, but if you know otherwise, please share!):
/usr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
return f(*args, **kwds)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_util.py", line 468, in make_tensor_proto
str_values = [compat.as_bytes(x) for x in proto_values]
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_util.py", line 468, in <listcomp>
str_values = [compat.as_bytes(x) for x in proto_values]
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/compat.py", line 65, in as_bytes
(bytes_or_text,))
TypeError: Expected binary or unicode string, got <tensorflow.python.layers.convolutional.Conv1D object at 0x7fa67e0d1a20>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "minimal reshape example.py", line 16, in <module>
con = tf.reshape(con, [-1,width,3,2])
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 3938, in reshape
"Reshape", tensor=tensor, shape=shape, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 513, in _apply_op_helper
raise err
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 510, in _apply_op_helper
preferred_dtype=default_dtype)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 926, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/constant_op.py", line 229, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/constant_op.py", line 208, in constant
value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_util.py", line 472, in make_tensor_proto
"supported type." % (type(values), values))
TypeError: Failed to convert object of type <class 'tensorflow.python.layers.convolutional.Conv1D'> to Tensor. Contents: <tensorflow.python.layers.convolutional.Conv1D object at 0x7fa67e0d1a20>. Consider casting elements to a supported type.
My code fails at con = tf.reshape(con, [-1,50,3,2]). Yet the pattern is nearly identical to the pattern that I use for the fully-connected graph, fc.
I made nets very similar to these work in the higher-level API for TensorFlow called TFLearn. However, TFLearn's DNN Estimator object is having trouble managing a tf.Session correctly. After over a month, I have yet to resolve the issue with TFLearn's developers on GitHub.
I don't mind using TensorFlow's native Estimator, but I have to solve this reshape problem to achieve it.
Well, I found the error: tf.layers.Conv1D != tf.layers.conv1d. Changing the former to the latter eliminated the error. Let the TensorFlow / Python user beware!
Even though TensorFlow seems to avoid Python's object model (which is probably necessary, given the possibility of distributed, low-level computation), there are in fact a few genuine classes in the Python API. The class constructors can accept many (all?) of the same arguments as the similarly-named convenience functions.

Resources