Restoring saved model in Tensorflow v1.14

I am using TensorFlow v1.14. I have a saved model and I'm trying to restore it with the following code:
loader = tf.train.import_meta_graph("models/fcnn0/model.ckpt.meta")
graph = tf.get_default_graph()
sess = tf.Session()
loader.restore(sess, "models/fcnn0/model.ckpt")
The same code worked without errors in TensorFlow v1.13, but now I get this error:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1356, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.DataLossError: file is too short to be an sstable
[[{{node save/RestoreV2}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/sandesh/PycharmProjects/fading/finding_code/src/load_32_64.py", line 8, in <module>
loader.restore(sess, "models/fcnn_32_64_aenc_1331_747_3870000/model.ckpt")
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 1286, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 950, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1173, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1350, in _do_run
run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1370, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.DataLossError: file is too short to be an sstable
[[node save/RestoreV2 (defined at home/sandesh/PycharmProjects/fading/finding_code/src/load_32_64.py:5) ]]
Original stack trace for 'save/RestoreV2':
File "home/sandesh/PycharmProjects/fading/finding_code/src/load_32_64.py", line 5, in <module>
loader = tf.train.import_meta_graph("models/fcnn_32_64_aenc0/model.ckpt.meta")
File "usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 1449, in import_meta_graph
**kwargs)[0]
File "usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 1473, in _import_meta_graph_with_return_elements
**kwargs))
File "usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/meta_graph.py", line 857, in import_scoped_meta_graph_with_return_elements
return_elements=return_elements)
File "usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py", line 443, in import_graph_def
_ProcessNewOps(graph)
File "usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py", line 236, in _ProcessNewOps
for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access
File "usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3751, in _add_new_tf_operations
for c_op in c_api_util.new_tf_operations(self)
File "usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3751, in <listcomp>
for c_op in c_api_util.new_tf_operations(self)
File "usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3641, in _create_op_from_tf_operation
ret = Operation(c_op, self)
File "usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 2005, in __init__
self._traceback = tf_stack.extract_stack()
Can someone point out what I'm doing wrong? Thanks in advance.

I was looking into the folder where the model files were saved and found that the model.ckpt.meta file had not been written to disk properly. I reran the training and saved the model and then it worked perfectly.
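A quick way to catch this kind of problem before calling restore is to confirm that every file belonging to the checkpoint exists and is non-empty. The sketch below is only an illustration: check_checkpoint_files is a hypothetical helper (not part of TensorFlow), and it assumes the usual checkpoint layout of a .meta file, an .index file, and one or more .data-* shards next to each other. A non-zero size does not prove a file is complete, so treat this as a first check only.

```python
import glob
import os


def check_checkpoint_files(prefix):
    """Return a list of problems found for a checkpoint prefix.

    Assumes the checkpoint saved as `prefix` consists of prefix.meta,
    prefix.index and one or more prefix.data-* shard files.
    """
    problems = []
    expected = [prefix + ".meta", prefix + ".index"]
    # glob.escape guards against special characters in the directory name.
    shards = sorted(glob.glob(glob.escape(prefix) + ".data-*"))
    if not shards:
        problems.append("no data shards matching " + prefix + ".data-*")
    for path in expected + shards:
        if not os.path.isfile(path):
            problems.append("missing: " + path)
        elif os.path.getsize(path) == 0:
            problems.append("empty: " + path)
    return problems
```

Calling check_checkpoint_files("models/fcnn0/model.ckpt") before loader.restore(...) and printing the returned list would show any missing or zero-length file; an empty list means the basic check passed.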

Related

output and feed_dict inside session FailedPreconditionError (see above for traceback): Attempting to use uninitialized value

I am converting the MTCNN TensorFlow model to TensorFlow-TensorRT.
When I run camera_test.py, I get this error:
FailedPreconditionError: Attempting to use uninitialized value
Traceback (most recent call last):
File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value conv4_2/biases
[[{{node conv4_2/biases/read}}]]
[[{{node Squeeze_1}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "camera_test_trrt.py", line 48, in <module>
boxes_c,landmarks = mtcnn_detector.detect(image)
File "../Detection/MtcnnDetector.py", line 371, in detect
boxes, boxes_c, _ = self.detect_pnet(img)
File "../Detection/MtcnnDetector.py", line 221, in detect_pnet
cls_cls_map, reg = self.pnet_detector.predict(im_resized)
File "../Detection/fcn_detector_trrt.py", line 56, in predict
self.height_op: height})
File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value conv4_2/biases
[[node conv4_2/biases/read (defined at ../train_models/mtcnn_model.py:208) ]]
[[node Squeeze_1 (defined at ../train_models/mtcnn_model.py:245) ]]
Caused by op 'conv4_2/biases/read', defined at: File
"camera_test_trrt.py", line 23, in
PNet = FcnDetector(P_Net, '/home/jetsonnano/Downloads/MTCNN-Tensorflow-master/test/p_output_graph_FP16.pb')
File "../Detection/fcn_detector_trrt.py", line 23, in init
self.cls_prob, self.bbox_pred, _ = net_factory(image_reshape, training=False) File "../train_models/mtcnn_model.py", line 208, in
P_Net
bbox_pred = slim.conv2d(net,num_outputs=4,kernel_size=[1,1],stride=1,scope='conv4_2',activation_fn=None)
File
"/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py",
line 182, in func_with_args
return func(*args, **current_args) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py",
line 1158, in convolution2d
conv_dims=2) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py",
line 182, in func_with_args
return func(*args, **current_args) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py",
line 1061, in convolution
outputs = layer.apply(inputs) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py",
line 1227, in apply
return self.call(inputs, *args, **kwargs) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/layers/base.py",
line 530, in call
outputs = super(Layer, self).call(inputs, *args, **kwargs) File
"/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py",
line 538, in call
self._maybe_build(inputs) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py",
line 1603, in _maybe_build
self.build(input_shapes) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/keras/layers/convolutional.py",
line 174, in build
dtype=self.dtype) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/layers/base.py",
line 435, in add_weight
getter=vs.get_variable) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py",
line 349, in add_weight
aggregation=aggregation) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/training/checkpointable/base.py",
line 607, in _add_variable_with_custom_getter
**kwargs_for_getter) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py",
line 1479, in get_variable
aggregation=aggregation) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py",
line 1220, in get_variable
aggregation=aggregation) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py",
line 530, in get_variable
return custom_getter(**custom_getter_kwargs) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py",
line 1753, in layer_variable_getter
return _model_variable_getter(getter, *args, **kwargs) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py",
line 1744, in _model_variable_getter
aggregation=aggregation) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py",
line 182, in func_with_args
return func(*args, **current_args) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/variables.py",
line 350, in model_variable
aggregation=aggregation) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py",
line 182, in func_with_args
return func(*args, **current_args) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/variables.py",
line 277, in variable
aggregation=aggregation) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py",
line 499, in _true_getter
aggregation=aggregation) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py",
line 911, in _get_single_variable
aggregation=aggregation) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/variables.py",
line 213, in call
return cls._variable_v1_call(*args, **kwargs) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/variables.py",
line 176, in _variable_v1_call
aggregation=aggregation) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/variables.py",
line 155, in
previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py",
line 2495, in default_variable_creator
expected_shape=expected_shape, import_scope=import_scope) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/variables.py",
line 217, in call
return super(VariableMetaclass, cls).call(*args, **kwargs) File
"/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/variables.py",
line 1395, in init
constraint=constraint) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/variables.py",
line 1557, in _init_from_args
self._snapshot = array_ops.identity(self._variable, name="read") File
"/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py",
line 180, in wrapper
return target(*args, **kwargs) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py",
line 81, in identity
ret = gen_array_ops.identity(input, name=name) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py",
line 3890, in identity
"Identity", input=input, name=name) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py",
line 788, in _apply_op_helper
op_def=op_def) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py",
line 507, in new_func
return func(*args, **kwargs) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/framework/ops.py",
line 3300, in create_op
op_def=op_def) File "/home/jetsonnano/.virtualenvs/jetsonnanotest/lib/python3.6/site-packages/tensorflow/python/framework/ops.py",
line 1801, in init
self._traceback = tf_stack.extract_stack()
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value conv4_2/biases
[[node conv4_2/biases/read (defined at ../train_models/mtcnn_model.py:208) ]]
[[node Squeeze_1 (defined at ../train_models/mtcnn_model.py:245) ]]
How do I run tf.global_variables_initializer through sess.run, as in
init_op = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init_op)
when my sess.run calls already take output tensors and a feed_dict? In detector.py I have
cls_prob, bbox_pred,landmark_pred = self.sess.run([self.cls_prob, self.bbox_pred,self.landmark_pred], feed_dict={self.image_op: data})
and in fcn_detector.py I have
cls_prob, bbox_pred = self.sess.run([self.cls_prob, self.bbox_pred],feed_dict={self.image_op: databatch, self.width_op: width,self.height_op: height})
Can anyone help out here?

Just after the following line
self.sess = tf.Session( config=tf.ConfigProto(allow_soft_placement=True, gpu_options=tf.GPUOptions(allow_growth=True)))
declare
init_op = tf.global_variables_initializer()
and do
self.sess.run(init_op)

Odoo 11 throws error on customer/partner listing after upgrade from odoo 10

I recently upgraded an Odoo 10 instance to Odoo 11 using OpenUpgrade. After deleting the projects module (apparently it cannot be migrated properly) and fixing some custom modules, I finally have a running Odoo 11 instance. However, whenever I click on Customers/Vendors, Odoo throws this error:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/odoo/fields.py", line 948, in __get__
value = record.env.cache.get(record, self)
File "/usr/lib/python3/dist-packages/odoo/api.py", line 977, in get
value = self._data[key][field][record._ids[0]]
KeyError: 421
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/odoo/fields.py", line 948, in __get__
value = record.env.cache.get(record, self)
File "/usr/lib/python3/dist-packages/odoo/api.py", line 977, in get
value = self._data[key][field][record._ids[0]]
KeyError: 421
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/odoo/http.py", line 653, in _handle_exception
return super(JsonRequest, self)._handle_exception(exception)
File "/usr/lib/python3/dist-packages/odoo/http.py", line 312, in _handle_exception
raise pycompat.reraise(type(exception), exception, sys.exc_info()[2])
File "/usr/lib/python3/dist-packages/odoo/tools/pycompat.py", line 87, in reraise
raise value
File "/usr/lib/python3/dist-packages/odoo/http.py", line 695, in dispatch
result = self._call_function(**self.params)
File "/usr/lib/python3/dist-packages/odoo/http.py", line 344, in _call_function
return checked_call(self.db, *args, **kwargs)
File "/usr/lib/python3/dist-packages/odoo/service/model.py", line 97, in wrapper
return f(dbname, *args, **kwargs)
File "/usr/lib/python3/dist-packages/odoo/http.py", line 337, in checked_call
result = self.endpoint(*a, **kw)
File "/usr/lib/python3/dist-packages/odoo/http.py", line 939, in __call__
return self.method(*args, **kw)
File "/usr/lib/python3/dist-packages/odoo/http.py", line 517, in response_wrap
response = f(*args, **kw)
File "/usr/lib/python3/dist-packages/odoo/addons/web/controllers/main.py", line 934, in call_kw
return self._call_kw(model, method, args, kwargs)
File "/usr/lib/python3/dist-packages/odoo/addons/web/controllers/main.py", line 926, in _call_kw
return call_kw(request.env[model], method, args, kwargs)
File "/usr/lib/python3/dist-packages/odoo/api.py", line 697, in call_kw
return call_kw_model(method, model, args, kwargs)
File "/usr/lib/python3/dist-packages/odoo/api.py", line 682, in call_kw_model
result = method(recs, *args, **kwargs)
File "/usr/lib/python3/dist-packages/odoo/models.py", line 1296, in load_views
for [v_id, v_type] in views
File "/usr/lib/python3/dist-packages/odoo/models.py", line 1296, in <dictcomp>
for [v_id, v_type] in views
File "/usr/lib/python3/dist-packages/odoo/addons/mail/models/mail_thread.py", line 374, in fields_view_get
res = super(MailThread, self).fields_view_get(view_id=view_id, view_type=view_type, toolbar=toolbar, submenu=submenu)
File "/usr/lib/python3/dist-packages/odoo/models.py", line 1375, in fields_view_get
result = self._fields_view_get(view_id=view_id, view_type=view_type, toolbar=toolbar, submenu=submenu)
File "/usr/lib/python3/dist-packages/odoo/addons/base/res/res_partner.py", line 318, in _fields_view_get
res = super(Partner, self)._fields_view_get(view_id=view_id, view_type=view_type, toolbar=toolbar, submenu=submenu)
File "/usr/lib/python3/dist-packages/odoo/models.py", line 1338, in _fields_view_get
root_view = View.browse(view_id).read_combined(['id', 'name', 'field_parent', 'type', 'model', 'arch'])
File "/usr/lib/python3/dist-packages/odoo/addons/base/ir/ir_ui_view.py", line 730, in read_combined
arch = self.apply_view_inheritance(arch_tree, root.id, self.model)
File "/usr/lib/python3/dist-packages/odoo/addons/base/ir/ir_ui_view.py", line 674, in apply_view_inheritance
sql_inherit = self.get_inheriting_views_arch(source_id, model)
File "/usr/lib/python3/dist-packages/odoo/addons/base/ir/ir_ui_view.py", line 498, in get_inheriting_views_arch
for view in views.sudo()
File "/usr/lib/python3/dist-packages/odoo/addons/base/ir/ir_ui_view.py", line 499, in <listcomp>
if not view.groups_id or (view.groups_id & user_groups)]
File "/usr/lib/python3/dist-packages/odoo/fields.py", line 952, in __get__
self.determine_value(record)
File "/usr/lib/python3/dist-packages/odoo/fields.py", line 1065, in determine_value
self.compute_value(recs)
File "/usr/lib/python3/dist-packages/odoo/fields.py", line 1019, in compute_value
self._compute_value(records)
File "/usr/lib/python3/dist-packages/odoo/fields.py", line 1010, in _compute_value
getattr(records, self.compute)()
File "/usr/lib/python3/dist-packages/odoo/addons/base/ir/ir_ui_view.py", line 260, in _compute_arch
view.arch = pycompat.to_text(arch_fs or view.arch_db)
File "/usr/lib/python3/dist-packages/odoo/fields.py", line 952, in __get__
self.determine_value(record)
File "/usr/lib/python3/dist-packages/odoo/fields.py", line 1055, in determine_value
record._prefetch_field(self)
File "/usr/lib/python3/dist-packages/odoo/models.py", line 2663, in _prefetch_field
result = records.read([f.name for f in fs], load='_classic_write')
File "/usr/lib/python3/dist-packages/odoo/models.py", line 2601, in read
self._read_from_database(stored, inherited)
File "/usr/lib/python3/dist-packages/odoo/models.py", line 2744, in _read_from_database
values[index] = translate(ids[index], values[index])
File "/usr/lib/python3/dist-packages/odoo/fields.py", line 1344, in translate
return self.translate(src_trans.get, value)
File "/usr/lib/python3/dist-packages/odoo/tools/translate.py", line 306, in xml_translate
try:
File "/usr/lib/python3/dist-packages/odoo/tools/translate.py", line 284, in parse_xml
return etree.fromstring(bytes(text, encoding='utf-8'))
File "src/lxml/lxml.etree.pyx", line 3213, in lxml.etree.fromstring (src/lxml/lxml.etree.c:79003)
File "src/lxml/parser.pxi", line 1843, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:118275)
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.
This seems odd, since everything I have read suggests changing the Odoo code itself, which I feel I should definitely not do. Does anyone have a pointer on where to look? I see that translate.py parses the XML without an explicit encoding, but if that were the root cause Odoo would not work at all, and it does work on a fresh instance, so the root cause must be something else.
Thanks for any hint.
UPDATE: some further details
What I was trying to achieve:
I run an Odoo 10 instance and wanted to upgrade it to Odoo 11 using https://github.com/OCA/OpenUpgrade. This worked in the sense that all data was migrated and I had a working Odoo 11 instance. Before upgrading, I stripped the Odoo 10 instance down to the bare minimum of modules to avoid further conflicts during the upgrade. After upgrading I could log in and everything looked good, except that any view that is supposed to display customer data produces the error described above. Something seems to go wrong during the upgrade, but I have no idea what. I thought I could export and delete all customers in the old instance, upgrade, and then re-import them, but that is not possible because documents and invoices are attached to the customers. So this is not an option.
Any ideas?
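
One way to narrow this down, offered as an assumption rather than a confirmed diagnosis: lxml raises this ValueError when it is asked to parse a Python str that begins with an XML declaration carrying an encoding attribute, so a migrated view whose stored XML starts with such a declaration would be a candidate. The sketch below uses only the standard library and plain (id, text) pairs. The names has_xml_declaration and strip_xml_declaration and the sample data are hypothetical, and how the view texts are read from the database is deliberately left open.

```python
import re

# Matches a leading XML declaration such as
# <?xml version="1.0" encoding="utf-8"?>
_XML_DECLARATION = re.compile(r"^\s*<\?xml[^>]*\?>\s*")


def has_xml_declaration(text):
    """Return True if the text starts with an XML declaration."""
    return _XML_DECLARATION.match(text) is not None


def strip_xml_declaration(text):
    """Return the text without a leading XML declaration, if any."""
    return _XML_DECLARATION.sub("", text, count=1)


# Hypothetical sample: (view id, stored XML text) pairs.
views = [
    (1, '<form><field name="name"/></form>'),
    (2, '<?xml version="1.0" encoding="utf-8"?>\n<tree><field name="name"/></tree>'),
]

# Collect the ids whose text would trip the str-with-declaration check.
suspects = [view_id for view_id, arch in views if has_xml_declaration(arch)]
print(suspects)  # [2]
```

If a scan like this turns up suspects, the affected records can be inspected and repaired as data, without touching the Odoo source.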

ResourceExhaustedError when using GPU but not CPU on Azure

My code builds the graph and runs it in CPU mode on Azure ML, but in GPU mode it fails with a ResourceExhaustedError before training starts (the traceback below points at sess.run(init)).
I switch between CPU and GPU modes simply by removing the tf.device call:
with tf.device('/cpu:0'), tf.name_scope('embedding'): #cpu mode runs fine
with tf.name_scope('embedding'): #gpu mode throw exception
I tried loading less data, but that didn't help either.
I suspect I missed a step when setting up the GPU. Any ideas?
Azure error message:
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[78298,300]
[[Node: embedding_matrix/Assign = Assign[T=DT_FLOAT, _class=["loc:@embedding_matrix"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](embedding_matrix, embedding_matrix/Initializer/Const)]]
Complete error msg:
Traceback (most recent call last):
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1323, in _do_call
return fn(*args)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1302, in _run_fn
status, run_metadata)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[78298,300]
[[Node: embedding_matrix/Assign = Assign[T=DT_FLOAT, _class=["loc:@embedding_matrix"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](embedding_matrix, embedding_matrix/Initializer/Const)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "NN.py", line 130, in <module>
sess.run(init)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 889, in run
run_metadata_ptr)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1317, in _do_run
options, run_metadata)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[78298,300]
[[Node: embedding_matrix/Assign = Assign[T=DT_FLOAT, _class=["loc:@embedding_matrix"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](embedding_matrix, embedding_matrix/Initializer/Const)]]
Caused by op 'embedding_matrix/Assign', defined at:
File "NN.py", line 120, in <module>
, trainable=False)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 1203, in get_variable
constraint=constraint)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 1092, in get_variable
constraint=constraint)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 425, in get_variable
constraint=constraint)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 394, in _true_getter
use_resource=use_resource, constraint=constraint)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 805, in _get_single_variable
constraint=constraint)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 213, in __init__
constraint=constraint)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 346, in _init_from_args
validate_shape=validate_shape).op
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
validate_shape=validate_shape)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/ops/gen_state_ops.py", line 57, in assign
use_locking=use_locking, name=name)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
op_def=op_def)
File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[78298,300]
[[Node: embedding_matrix/Assign = Assign[T=DT_FLOAT, _class=["loc:@embedding_matrix"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](embedding_matrix, embedding_matrix/Initializer/Const)]]

Host memory is quite a bit larger than device memory on an N-series machine.
Are you sure you aren't simply exceeding the device capacity?
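For scale, the tensor named in the error can be sized by hand. The sketch below assumes 4 bytes per element, since the node is DT_FLOAT (32-bit floats); tensor_bytes is a hypothetical helper used only for this arithmetic.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Return the memory footprint in bytes of a dense tensor with this shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


# Shape taken from the error message: shape[78298,300]
size = tensor_bytes((78298, 300))
print(size)                       # 93957600
print(round(size / 2 ** 20, 1))   # 89.6 (MiB)
```

That is roughly 90 MiB for this single tensor, so whether the device is exhausted depends on what else is already allocated on it when the assignment runs.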

Python TensorFlow: UnimplementedError: Cast string to int32 is not supported

Hi everyone,
I am new to TensorFlow.
I wrote the following function:
def get_batch(image,label,new_height,new_width,batch_size,capacity):
    image=tf.cast(image,tf.string)
    label=tf.cast(image,tf.int32)
    input_queue= tf.train.slice_input_producer([image,label])
    label=input_queue[1]
    image_contents=tf.read_file(input_queue[0])
    image=tf.image.decode_jpeg(image_contents,channels=3)
    image=tf.image.resize_images(image,(new_height,new_width))
    image=tf.image.per_image_standardization(image)
    image_batch,label_batch=tf.train.batch([image,label],batch_size=batch_size,capacity=capacity,num_threads=8)
    label_batch=tf.reshape(label_batch,[batch_size])
    return image_batch,label_batch
By the way, the arguments image and label are returned by another function, which reads the files that store the images and labels. I define new_height and new_width as constants when I run this code.
I get this error:
UnimplementedError (see above for traceback): Cast string to int32 is not supported
[[Node: Cast_1 = Cast[DstT=DT_INT32, SrcT=DT_STRING, _device="/job:localhost/replica:0/task:0/cpu:0"](Cast/x)]]
Traceback (most recent call last):
File "<ipython-input-1-5c65685872d1>", line 1, in <module>
runfile('C:/Users/yanghang/ugthesis/mean subtraction.py', wdir='C:/Users/yanghang/ugthesis')
File "C:\Users\yanghang\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 710, in runfile
execfile(filename, namespace)
File "C:\Users\yanghang\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 101, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/yanghang/ugthesis/mean subtraction.py", line 102, in <module>
coord.join(threads)
File "C:\Users\yanghang\Anaconda3\lib\site-packages\tensorflow\python\training\coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "C:\Users\yanghang\Anaconda3\lib\site-packages\six.py", line 686, in reraise
raise value
File "C:\Users\yanghang\Anaconda3\lib\site-packages\tensorflow\python\training\queue_runner_impl.py", line 234, in _run
sess.run(enqueue_op)
File "C:\Users\yanghang\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 778, in run
run_metadata_ptr)
File "C:\Users\yanghang\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 982, in _run
feed_dict_string, options, run_metadata)
File "C:\Users\yanghang\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1032, in _do_run
target_list, options, run_metadata)
File "C:\Users\yanghang\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1052, in _do_call
raise type(e)(node_def, op, message)
Could you tell me how to solve this problem?
Thanks in advance.

As pointed out by @mrry, there was a typo in the code:
label=tf.cast(image,tf.int32)
Which should instead be:
label=tf.cast(label,tf.int32)

Tensorflow: sess.run(x) not working. InvalidArgumentError: Cannot assign a device for operation 'MatMul': Operation was assigned to /device:GPU:1

I'm using Python 3.6 (Anaconda) on a 64-bit Windows PC with TensorFlow 1.2.1. I'm running the following simple code:
import tensorflow as tf
sess = tf.Session()
x1 = tf.constant(5)
x2 = tf.constant(6)
# runs result
print(sess.run(x1))
It gives me the following error:
Traceback (most recent call last):
File "<ipython-input-64-f7e8ea564f81>", line 7, in <module>
print(sess.run(x1))
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 789, in run
run_metadata_ptr)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
InvalidArgumentError: Cannot assign a device for operation 'MatMul': Operation was explicitly assigned to /device:GPU:1 but available devices are [ /job:localhost/replica:0/task:0/cpu:0 ]. Make sure the device specification refers to a valid device.
[[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/device:GPU:1"](Const_2, Const_3)]]
Caused by op 'MatMul', defined at:
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\spyder\utils\ipython\start_kernel.py", line 227, in <module>
main()
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\spyder\utils\ipython\start_kernel.py", line 223, in main
kernel.start()
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel\kernelapp.py", line 474, in start
ioloop.IOLoop.instance().start()
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\zmq\eventloop\ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tornado\ioloop.py", line 887, in start
handler_func(fd_obj, events)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tornado\stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tornado\stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 276, in dispatcher
return self.dispatch_shell(stream, msg)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 228, in dispatch_shell
handler(stream, idents, msg)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 390, in execute_request
user_expressions, allow_stdin)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel\ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 501, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2717, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2821, in run_ast_nodes
if self.run_code(code, result):
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-18-02c5e13ac58a>", line 5, in <module>
product = tf.matmul(matrix1, matrix2)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1816, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 1217, in _mat_mul
transpose_b=transpose_b, name=name)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op
op_def=op_def)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1269, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Cannot assign a device for operation 'MatMul': Operation was explicitly assigned to /device:GPU:1 but available devices are [ /job:localhost/replica:0/task:0/cpu:0 ]. Make sure the device specification refers to a valid device.
[[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/device:GPU:1"](Const_2, Const_3)]]
Before this, the same code ran fine. It has suddenly started showing the above error, even though I have not changed the Anaconda environment or installed any other package.


Restoring saved model in Tensorflow v1.14 - python-3.x

I looked in the folder where the model files were saved and found that the model.ckpt.meta file had not been written to disk properly. I reran the training, saved the model again, and then the restore worked without errors.
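
Since the cause here was a file that had not been fully written, a quick check of the files on disk before calling restore can catch this early. The following is only a rough sketch, not part of the original fix: the helper name find_checkpoint_problems is made up for illustration, and it assumes the checkpoint was saved in the default V2 format, where tf.train.Saver writes a .meta file, an .index file and one or more .data-* shards next to the prefix. It only checks that the files exist and are not empty; it cannot tell whether their contents are valid.

```python
import glob
import os


def find_checkpoint_problems(prefix):
    """Return a list of problems found with the files behind a checkpoint prefix.

    Hypothetical helper: an empty list only means every expected file exists
    and is non-empty. It does not prove that the contents are valid.
    """
    problems = []

    # Assumption: V2 checkpoint layout, i.e. <prefix>.meta and <prefix>.index.
    for suffix in (".meta", ".index"):
        path = prefix + suffix
        if not os.path.isfile(path):
            problems.append("missing: " + path)
        elif os.path.getsize(path) == 0:
            problems.append("empty: " + path)

    # Variable values are stored in shards named <prefix>.data-XXXXX-of-YYYYY.
    shards = sorted(glob.glob(glob.escape(prefix) + ".data-*"))
    if not shards:
        problems.append("missing: " + prefix + ".data-*")
    for path in shards:
        if os.path.getsize(path) == 0:
            problems.append("empty: " + path)

    return problems


if __name__ == "__main__":
    # Path taken from the question; adjust it to your own checkpoint prefix.
    for problem in find_checkpoint_problems("models/fcnn0/model.ckpt"):
        print(problem)
```

If the helper reports anything, saving the model again (as I did) is probably quicker than trying to repair the files. If it reports nothing and restore still fails, the files may be truncated rather than empty, which this check does not detect.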
