Training four models simultaneously on a single machine with 8 GPUs - multithreading

I have an 8-GPU workstation for deep learning with TensorFlow.
The workstation specifications are as follows:
Intel Xeon Gold x2
NVIDIA Quadro RTXA6000 x8
RAM 1TB
I use:
Python 3.8
Ubuntu 20.04
TensorFlow 2.8
CUDA 11.2
cuDNN 8.1
I want to run four Jupyter notebooks simultaneously to train four models, because I think using all 8 GPUs to train a single model is not very efficient. However, a notebook died during training and I got several warning messages.
Do you have any ideas on how to fix these errors, and are they related to running multiple notebooks?
[E 13:32:28.388 NotebookApp] Uncaught exception in ZMQStream callback
Traceback (most recent call last):
File "/home/super/researchvenv/lib/python3.8/site-packages/zmq/eventloop/zmqstream.py", line 584, in _run_callback
f = callback(*args, **kwargs)
File "/home/super/researchvenv/lib/python3.8/site-packages/zmq/eventloop/zmqstream.py", line 308, in stream_callback
return callback(self, msg)
File "/home/super/researchvenv/lib/python3.8/site-packages/notebook/services/kernels/handlers.py", line 572, in _on_zmq_reply
super()._on_zmq_reply(stream, msg)
File "/home/super/researchvenv/lib/python3.8/site-packages/notebook/base/zmqhandlers.py", line 256, in _on_zmq_reply
self.write_message(msg, binary=isinstance(msg, bytes))
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/websocket.py", line 339, in write_message
return self.ws_connection.write_message(message, binary=binary)
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/websocket.py", line 1086, in write_message
fut = self._write_frame(True, opcode, message, flags=flags)
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/websocket.py", line 1061, in _write_frame
return self.stream.write(frame)
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/iostream.py", line 546, in write
self._handle_write()
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/iostream.py", line 976, in _handle_write
self._write_buffer.advance(num_bytes)
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/iostream.py", line 182, in advance
assert 0 < size <= self._size
AssertionError
[E 13:32:28.388 NotebookApp] Uncaught exception in zmqstream callback
Traceback (most recent call last):
File "/home/super/researchvenv/lib/python3.8/site-packages/zmq/eventloop/zmqstream.py", line 621, in _handle_events
self._handle_recv()
File "/home/super/researchvenv/lib/python3.8/site-packages/zmq/eventloop/zmqstream.py", line 650, in _handle_recv
self._run_callback(callback, msg)
File "/home/super/researchvenv/lib/python3.8/site-packages/zmq/eventloop/zmqstream.py", line 584, in _run_callback
f = callback(*args, **kwargs)
File "/home/super/researchvenv/lib/python3.8/site-packages/zmq/eventloop/zmqstream.py", line 308, in stream_callback
return callback(self, msg)
File "/home/super/researchvenv/lib/python3.8/site-packages/notebook/services/kernels/handlers.py", line 572, in _on_zmq_reply
super()._on_zmq_reply(stream, msg)
File "/home/super/researchvenv/lib/python3.8/site-packages/notebook/base/zmqhandlers.py", line 256, in _on_zmq_reply
self.write_message(msg, binary=isinstance(msg, bytes))
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/websocket.py", line 339, in write_message
return self.ws_connection.write_message(message, binary=binary)
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/websocket.py", line 1086, in write_message
fut = self._write_frame(True, opcode, message, flags=flags)
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/websocket.py", line 1061, in _write_frame
return self.stream.write(frame)
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/iostream.py", line 546, in write
self._handle_write()
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/iostream.py", line 976, in _handle_write
self._write_buffer.advance(num_bytes)
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/iostream.py", line 182, in advance
assert 0 < size <= self._size
AssertionError
[E 13:32:28.389 NotebookApp] Exception in callback functools.partial(<function ZMQStream._update_handler.<locals>.<lambda> at 0x7f5da3064310>)
Traceback (most recent call last):
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/ioloop.py", line 740, in _run_callback
ret = callback()
File "/home/super/researchvenv/lib/python3.8/site-packages/zmq/eventloop/zmqstream.py", line 705, in <lambda>
self.io_loop.add_callback(lambda: self._handle_events(self.socket, 0))
File "/home/super/researchvenv/lib/python3.8/site-packages/zmq/eventloop/zmqstream.py", line 621, in _handle_events
self._handle_recv()
File "/home/super/researchvenv/lib/python3.8/site-packages/zmq/eventloop/zmqstream.py", line 650, in _handle_recv
self._run_callback(callback, msg)
File "/home/super/researchvenv/lib/python3.8/site-packages/zmq/eventloop/zmqstream.py", line 584, in _run_callback
f = callback(*args, **kwargs)
File "/home/super/researchvenv/lib/python3.8/site-packages/zmq/eventloop/zmqstream.py", line 308, in stream_callback
return callback(self, msg)
File "/home/super/researchvenv/lib/python3.8/site-packages/notebook/services/kernels/handlers.py", line 572, in _on_zmq_reply
super()._on_zmq_reply(stream, msg)
File "/home/super/researchvenv/lib/python3.8/site-packages/notebook/base/zmqhandlers.py", line 256, in _on_zmq_reply
self.write_message(msg, binary=isinstance(msg, bytes))
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/websocket.py", line 339, in write_message
return self.ws_connection.write_message(message, binary=binary)
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/websocket.py", line 1086, in write_message
fut = self._write_frame(True, opcode, message, flags=flags)
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/websocket.py", line 1061, in _write_frame
return self.stream.write(frame)
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/iostream.py", line 546, in write
self._handle_write()
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/iostream.py", line 976, in _handle_write
self._write_buffer.advance(num_bytes)
File "/home/super/researchvenv/lib/python3.8/site-packages/tornado/iostream.py", line 182, in advance
assert 0 < size <= self._size
AssertionError
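One common way to keep four notebooks from competing for the same devices is to give each kernel its own pair of GPUs through the CUDA_VISIBLE_DEVICES environment variable, set before TensorFlow is imported. The sketch below only illustrates that idea: the helper name gpu_ids_for_job and the even two-GPUs-per-notebook split are hypothetical, and it is not a confirmed fix for the traceback above, which is raised by the notebook server rather than by TensorFlow.

```python
import os


def gpu_ids_for_job(job_index, n_jobs=4, n_gpus=8):
    """Return the comma-separated GPU ids for one of n_jobs equal slices."""
    if n_gpus % n_jobs != 0:
        raise ValueError("n_gpus must be divisible by n_jobs")
    if not 0 <= job_index < n_jobs:
        raise ValueError("job_index out of range")
    per_job = n_gpus // n_jobs
    start = job_index * per_job
    return ",".join(str(i) for i in range(start, start + per_job))


# First cell of the second notebook (0-based index 1), run BEFORE
# "import tensorflow": this kernel will then only see GPUs 2 and 3.
os.environ["CUDA_VISIBLE_DEVICES"] = gpu_ids_for_job(1)

# Optional, after importing TensorFlow (assumes TF 2.x is installed):
# import tensorflow as tf
# for gpu in tf.config.list_physical_devices("GPU"):
#     tf.config.experimental.set_memory_growth(gpu, True)
```

Each notebook would use a different job_index (0 to 3), so the four kernels see disjoint GPU pairs.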
Running multiple jupyter notebooks in a single machine for multiple simultaneous TF model training

Errors installing PyTorch on Python 3.8 on Windows 11

C:\Users\ali_r>pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu117
Collecting torch
Downloading https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp38-cp38-win_amd64.whl (2255.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/2.3 GB 93.9 kB/s eta 6:39:59
ERROR: Exception:
Traceback (most recent call last):
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\urllib3\response.py", line 438, in _error_catcher
yield
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\urllib3\response.py", line 561, in read
data = self._fp_read(amt) if not fp_closed else b""
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\urllib3\response.py", line 519, in _fp_read
data = self._fp.read(chunk_amt)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\cachecontrol\filewrapper.py", line 90, in read
data = self.__fp.read(amt)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\http\client.py", line 454, in read
n = self.readinto(b)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\http\client.py", line 498, in readinto
n = self.fp.readinto(b)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\socket.py", line 669, in readinto
return self._sock.recv_into(b)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\ssl.py", line 1241, in recv_into
return self.read(nbytes, buffer)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\ssl.py", line 1099, in read
return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\cli\base_command.py", line 160, in exc_logging_wrapper
status = run_func(*args)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\cli\req_command.py", line 247, in wrapper
return func(self, options, args)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\commands\install.py", line 419, in run
requirement_set = resolver.resolve(
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\resolution\resolvelib\resolver.py", line 92, in resolve
result = self._result = resolver.resolve(
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 481, in resolve
state = resolution.resolve(requirements, max_rounds=max_rounds)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 348, in resolve
self._add_to_criteria(self.state.criteria, r, parent=None)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 172, in _add_to_criteria
if not criterion.candidates:
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\resolvelib\structs.py", line 151, in __bool__
return bool(self._sequence)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\resolution\resolvelib\found_candidates.py", line 155, in __bool__
return any(self)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\resolution\resolvelib\found_candidates.py", line 143, in <genexpr>
return (c for c in iterator if id(c) not in self._incompatible_ids)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\resolution\resolvelib\found_candidates.py", line 47, in _iter_built
candidate = func()
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\resolution\resolvelib\factory.py", line 206, in _make_candidate_from_link
self._link_candidate_cache[link] = LinkCandidate(
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\resolution\resolvelib\candidates.py", line 297, in __init__
super().__init__(
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\resolution\resolvelib\candidates.py", line 162, in __init__
self.dist = self._prepare()
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\resolution\resolvelib\candidates.py", line 231, in _prepare
dist = self._prepare_distribution()
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\resolution\resolvelib\candidates.py", line 308, in _prepare_distribution
return preparer.prepare_linked_requirement(self._ireq, parallel_builds=True)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\operations\prepare.py", line 491, in prepare_linked_requirement
return self._prepare_linked_requirement(req, parallel_builds)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\operations\prepare.py", line 536, in _prepare_linked_requirement
local_file = unpack_url(
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\operations\prepare.py", line 166, in unpack_url
file = get_http_url(
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\operations\prepare.py", line 107, in get_http_url
from_path, content_type = download(link, temp_dir.path)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\network\download.py", line 147, in __call__
for chunk in chunks:
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\cli\progress_bars.py", line 53, in _rich_progress_bar
for chunk in iterable:
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_internal\network\utils.py", line 63, in response_chunks
for chunk in response.raw.stream(
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\urllib3\response.py", line 622, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\urllib3\response.py", line 587, in read
raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "c:\users\ali_r\appdata\local\programs\python\python38\lib\site-packages\pip\_vendor\urllib3\response.py", line 443, in _error_catcher
raise ReadTimeoutError(self._pool, None, "Read timed out.")
pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='download.pytorch.org', port=443): Read timed out.
I don't know why this happened.
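The final ReadTimeoutError shows the connection stalling while pip streamed a 2.3 GB wheel at under 100 kB/s. One possible mitigation, sketched below, is to rerun pip with a longer socket timeout and more retries through its --timeout and --retries options. The helper name pip_install_command and the values of 1000 seconds and 10 retries are illustrative assumptions, not recommended settings.

```python
import sys


def pip_install_command(packages, extra_index_url, timeout_s=1000, retries=10):
    """Build a pip command line with a longer socket timeout and more retries."""
    return [
        sys.executable, "-m", "pip", "install",
        "--timeout", str(timeout_s),   # seconds to wait on a stalled socket
        "--retries", str(retries),     # connection attempts per request
        *packages,
        "--extra-index-url", extra_index_url,
    ]


cmd = pip_install_command(
    ["torch", "torchvision", "torchaudio"],
    "https://download.pytorch.org/whl/cu117",
)
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```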

Error when using joblib in Python with undetected chromedriver

When I use the following (self.links is an array of strings)
Parallel(n_jobs=2)(delayed(self.buybysize)(link) for link in self.links)
with these methods
def buybysize(self, link):
    browser = self.browser()
    # other commented stuff
def browser(self):
    options = uc.ChromeOptions()
    options.user_data_dir = self.user_data_dir
    options.add_argument(self.add_argument)
    driver = uc.Chrome(options=options)
    return driver
I get the error
joblib.externals.loky.process_executor._RemoteTraceback:
Traceback (most recent call last):
File "/home/Me/PycharmProjects/zalando_buy/venv/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 436, in _process_worker
r = call_item()
File "/home/Me/PycharmProjects/zalando_buy/venv/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 288, in __call__
return self.fn(*self.args, **self.kwargs)
File "/home/Me/PycharmProjects/zalando_buy/venv/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 595, in __call__
return self.func(*args, **kwargs)
File "/home/Me/PycharmProjects/zalando_buy/venv/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
return [func(*args, **kwargs)
File "/home/Me/PycharmProjects/zalando_buy/venv/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
return [func(*args, **kwargs)
File "/home/Me/PycharmProjects/zalando_buy/Zalando.py", line 91, in buybysize
browser = self.browser()
File "/home/Me/PycharmProjects/zalando_buy/Zalando.py", line 38, in browser
driver = uc.Chrome(options=options)
File "/home/Me/PycharmProjects/zalando_buy/venv/lib/python3.8/site-packages/undetected_chromedriver/__init__.py", line 388, in __init__
self.browser_pid = start_detached(
File "/home/Me/PycharmProjects/zalando_buy/venv/lib/python3.8/site-packages/undetected_chromedriver/dprocess.py", line 30, in start_detached
multiprocessing.Process(
File "/usr/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/usr/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/home/Me/PycharmProjects/zalando_buy/venv/lib/python3.8/site-packages/joblib/externals/loky/backend/process.py", line 39, in _Popen
return Popen(process_obj)
File "/home/Me/PycharmProjects/zalando_buy/venv/lib/python3.8/site-packages/joblib/externals/loky/backend/popen_loky_posix.py", line 52, in __init__
self._launch(process_obj)
File "/home/Me/PycharmProjects/zalando_buy/venv/lib/python3.8/site-packages/joblib/externals/loky/backend/popen_loky_posix.py", line 157, in _launch
pid = fork_exec(cmd_python, self._fds, env=process_obj.env)
AttributeError: 'Process' object has no attribute 'env'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/Me/PycharmProjects/zalando_buy/Start.py", line 4, in <module>
class Start:
File "/home/Me/PycharmProjects/zalando_buy/Start.py", line 7, in Start
zalando.startshopping()
File "/home/Me/PycharmProjects/zalando_buy/Zalando.py", line 42, in startshopping
self.openlinks()
File "/home/Me/PycharmProjects/zalando_buy/Zalando.py", line 50, in openlinks
Parallel(n_jobs=2)(delayed(self.buybysize)(link) for link in self.links)
File "/home/Me/PycharmProjects/zalando_buy/venv/lib/python3.8/site-packages/joblib/parallel.py", line 1056, in __call__
self.retrieve()
File "/home/Me/PycharmProjects/zalando_buy/venv/lib/python3.8/site-packages/joblib/parallel.py", line 935, in retrieve
self._output.extend(job.get(timeout=self.timeout))
File "/home/Me/PycharmProjects/zalando_buy/venv/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 542, in wrap_future_result
return future.result(timeout=timeout)
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 444, in result
return self.__get_result()
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
AttributeError: 'Process' object has no attribute 'env'
Process finished with exit code 1
To me it looks like there are instabilities because undetected chromedriver may already use multiprocessing, but isn't there any way to open multiple browsers with UC and process each iteration in parallel?
Edit: I debugged it, and the error appears when this line is executed:
driver = uc.Chrome(options=options)
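The remote traceback fails inside loky's process launcher when uc.Chrome tries to start its own multiprocessing.Process from within a joblib worker process. One thing that may be worth trying is to run the per-link work in threads rather than worker processes, so that no loky worker is involved. The sketch below uses the standard-library ThreadPoolExecutor. Here buy_by_size is a hypothetical stand-in for the buybysize method and does not start a real browser, and whether undetected chromedriver behaves well with several drivers in one process is an assumption that would need checking.

```python
from concurrent.futures import ThreadPoolExecutor


def buy_by_size(link):
    # Hypothetical stand-in for Zalando.buybysize: the real method would
    # create its own uc.Chrome driver here, one per call.
    return f"processed {link}"


def process_links(links, max_workers=2):
    # Threads stay inside the current process, so the browser library
    # starts its child process from the main interpreter, not from a
    # loky worker.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(buy_by_size, links))


results = process_links(["https://example.com/a", "https://example.com/b"])
print(results)
```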

Airflow scheduler works normally, fails with -D

I have set up Airflow on an AWS EC2 server with Ubuntu 18.04 and Python 3.6.9. The DB backend is MySQL 5.7.26, and I am using a LocalExecutor. The setup is nothing more than:
apt-get install python3 python3-pip python3-venv libmysqlclient-dev
I install Airflow with pip3 install apache-airflow[mysql]==1.10.9, initialize the DB connection, and start the webserver, which works both in the foreground and with -D. The scheduler, however, works only when run in the foreground. Trying to run it as a daemon fails with the following trace:
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 37, in <module> args.func(args)
File "/usr/local/lib/python3.6/dist-packages/airflow/utils/cli.py", line 75, in wrapper return f(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/airflow/bin/cli.py", line 1032, in scheduler job.run()
File "/usr/local/lib/python3.6/dist-packages/airflow/jobs/base_job.py", line 215, in run session.commit()
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 1036, in commit self.transaction.commit()
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 503, in commit self._prepare_impl()
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 482, in _prepare_impl self.session.flush()
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 2496, in flush self._flush(objects)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 2637, in _flush transaction.rollback(_capture_exception=True)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/langhelpers.py", line 69, in __exit__ exc_value, with_traceback=exc_tb,
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/compat.py", line 178, in raise_ raise exception
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 2597, in _flush flush_context.execute()
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute rec.execute(self)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute uow,
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/persistence.py", line 213, in save_obj ) in _organize_states_for_save(base_mapper, states, uowtransaction):
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/persistence.py", line 374, in _organize_states_for_save base_mapper, uowtransaction, states
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/persistence.py", line 1602, in _connections_for_states connection = uowtransaction.transaction.connection(base_mapper)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 313, in connection return self._connection_for_bind(bind, execution_options)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 420, in _connection_for_bind conn = self._parent._connection_for_bind(bind, execution_options)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 432, in _connection_for_bind conn = bind._contextual_connect()
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 2251, in _contextual_connect self._wrap_pool_connect(self.pool.connect, None),
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 2285, in _wrap_pool_connect return fn()
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/pool/base.py", line 363, in connect return _ConnectionFairy._checkout(self)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/pool/base.py", line 804, in _checkout result = pool._dialect.do_ping(fairy.connection)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 138, in do_ping dbapi_connection.ping(False)
File "/usr/local/lib/python3.6/dist-packages/pymysql/connections.py", line 546, in ping self._execute_command(COMMAND.COM_PING, "")
File "/usr/local/lib/python3.6/dist-packages/pymysql/connections.py", line 771, in _execute_command self._write_bytes(packet)
File "/usr/local/lib/python3.6/dist-packages/pymysql/connections.py", line 711, in _write_bytes self._sock.settimeout(self._write_timeout)
OSError: [Errno 9] Bad file descriptor
Exception ignored in: <function _ConnectionRecord.checkout.<locals>.<lambda> at 0x7f2eb025e268>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/pool/base.py", line 503, in <lambda>
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/pool/base.py", line 702, in _finalize_fairy
File "/usr/lib/python3.6/logging/__init__.py", line 1337, in error
File "/usr/lib/python3.6/logging/__init__.py", line 1444, in _log
File "/usr/lib/python3.6/logging/__init__.py", line 1454, in handle
File "/usr/lib/python3.6/logging/__init__.py", line 1516, in callHandlers
File "/usr/lib/python3.6/logging/__init__.py", line 865, in handle
File "/usr/lib/python3.6/logging/__init__.py", line 1071, in emit
File "/usr/lib/python3.6/logging/__init__.py", line 1061, in _open
NameError: name 'open' is not defined
I am trying to set up a server for semi-production purposes, so this is actually a blocker for me. I would be grateful for any advice.
EDIT
I tried using mysql-connector-python==8.0.18, and the scheduler did run as a daemon, but an attempt to open a DAG failed with the following trace:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.6/dist-packages/flask/_compat.py", line 39, in reraise
raise value
File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/usr/local/lib/python3.6/dist-packages/flask_admin/base.py", line 69, in inner
return self._run_view(f, *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/flask_admin/base.py", line 368, in _run_view
return fn(self, *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/flask_login/utils.py", line 258, in decorated_view
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/airflow/www/utils.py", line 384, in view_func
return f(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/airflow/www/utils.py", line 290, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/airflow/www/views.py", line 1559, in tree
start_date=min_date, end_date=base_date, session=session)
File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 70, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/airflow/models/dag.py", line 837, in get_task_instances
tis = tis.order_by(TaskInstance.execution_date).all()
File "/home/ubuntu/.local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3233, in all
return list(self)
File "/home/ubuntu/.local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3389, in __iter__
return self._execute_and_instances(context)
File "/home/ubuntu/.local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3414, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/home/ubuntu/.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 982, in execute
return meth(self, multiparams, params)
File "/home/ubuntu/.local/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 293, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/home/ubuntu/.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1101, in _execute_clauseelement
distilled_params,
File "/home/ubuntu/.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1250, in _execute_context
e, statement, parameters, cursor, context
File "/home/ubuntu/.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1478, in _handle_dbapi_exception
util.reraise(*exc_info)
File "/home/ubuntu/.local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 153, in reraise
raise value
File "/home/ubuntu/.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
cursor, statement, parameters, context
File "/home/ubuntu/.local/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 588, in do_execute
cursor.execute(statement, parameters)
File "/home/ubuntu/.local/lib/python3.6/site-packages/mysql/connector/cursor_cext.py", line 272, in execute
self._handle_result(result)
File "/home/ubuntu/.local/lib/python3.6/site-packages/mysql/connector/cursor_cext.py", line 163, in _handle_result
self._handle_resultset()
File "/home/ubuntu/.local/lib/python3.6/site-packages/mysql/connector/cursor_cext.py", line 651, in _handle_resultset
self._rows = self._cnx.get_rows()[0]
File "/home/ubuntu/.local/lib/python3.6/site-packages/mysql/connector/connection_cext.py", line 301, in get_rows
else self._cmysql.fetch_row()
SystemError: <built-in method fetch_row of _mysql_connector.MySQL object at 0x3cdaa10> returned a result with an error set
EDIT 2
What finally worked was using mysqlclient, selected by making the SQLAlchemy connection string begin with mysql+mysqldb. I am not posting this as an answer, because the initial problem persists.
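For reference, a connection string of the kind described in EDIT 2 could be built as follows. This is a minimal sketch: the helper name mysqldb_conn_uri and the credentials, host, and database name are placeholders, and it assumes the mysqlclient package is installed. AIRFLOW__CORE__SQL_ALCHEMY_CONN is the environment-variable form of the sql_alchemy_conn setting in the [core] section of airflow.cfg.

```python
import os
from urllib.parse import quote_plus


def mysqldb_conn_uri(user, password, host, database, port=3306):
    """Build a SQLAlchemy URI that selects the mysqlclient (MySQLdb) driver."""
    # quote_plus escapes characters such as "@" that would otherwise
    # break the user:password@host part of the URI.
    return (
        f"mysql+mysqldb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Placeholder credentials, for illustration only.
uri = mysqldb_conn_uri("airflow", "p@ss", "localhost", "airflow")
os.environ["AIRFLOW__CORE__SQL_ALCHEMY_CONN"] = uri
print(uri)
```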

Error running Pylint on Windows

I installed pylint via pip (version 9.0.1) on a Windows 7 machine with Python 3.5.0. The installation succeeded, but invoking Pylint returns an error "RuntimeError: Inconsistent hierarchy". Any ideas on how to troubleshoot this?
14:27:19 C:\Users\user2>pylint
Traceback (most recent call last):
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\functools.py", line 718, in dispatch
impl = dispatch_cache[cls]
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\weakref.py", line 352, in __getitem__
return self.data[ref(key)]
KeyError:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\functools.py", line 721, in dispatch
impl = registry[cls]
KeyError:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\runpy.py", line 170, in _run_module_as_main
"__main__", mod_spec)
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\user2\AppData\Local\Programs\Python\Python35-32\Scripts\pylint.exe\__main__.py", line 9, in <module>
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\site-packages\pylint\__init__.py", line 13, in run_pylint
Run(sys.argv[1:])
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\site-packages\pylint\lint.py", line 1222, in __init__
linter.load_default_plugins()
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\site-packages\pylint\lint.py", line 453, in load_default_plugins
checkers.initialize(self)
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\site-packages\pylint\checkers\__init__.py", line 114, in initialize
register_plugins(linter, path[0])
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\site-packages\pylint\utils.py", line 992, in register_plugins
module = modutils.load_module_from_file(join(directory, filename))
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\site-packages\astroid\modutils.py", line 272, in load_module_from_file
return load_module_from_modpath(modpath, path, use_sys)
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\site-packages\astroid\modutils.py", line 233, in load_module_from_modpath
module = imp.load_module(curname, mp_file, mp_filename, mp_desc)
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\imp.py", line 234, in load_module
return load_source(name, filename, file)
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\imp.py", line 172, in load_source
module = _load(spec)
File "", line 693, in _load
File "", line 673, in _load_unlocked
File "", line 662, in exec_module
File "", line 222, in _call_with_frames_removed
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\site-packages\pylint\checkers\python3.py", line 100, in <module>
class Python3Checker(checkers.BaseChecker):
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\site-packages\pylint\checkers\python3.py", line 501, in Python3Checker
'sys.version_info < (3, 0)',
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\site-packages\pylint\checkers\python3.py", line 496, in <listcomp>
[astroid.extract_node(x).repr_tree() for x in [
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\site-packages\astroid\node_classes.py", line 624, in repr_tree
_repr_tree(self, result, set())
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\functools.py", line 743, in wrapper
return dispatch(args[0].__class__)(*args, **kw)
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\site-packages\astroid\node_classes.py", line 613, in _repr_node
depth)
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\functools.py", line 743, in wrapper
return dispatch(args[0].__class__)(*args, **kw)
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\site-packages\astroid\node_classes.py", line 613, in _repr_node
depth)
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\functools.py", line 743, in wrapper
return dispatch(args[0].__class__)(*args, **kw)
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\functools.py", line 723, in dispatch
impl = _find_impl(cls, registry)
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\functools.py", line 674, in _find_impl
mro = _compose_mro(cls, registry.keys())
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\functools.py", line 662, in _compose_mro
return _c3_mro(cls, abcs=mro)
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\functools.py", line 616, in _c3_mro
other_c3_mros = [_c3_mro(base, abcs=abcs) for base in other_bases]
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\functools.py", line 616, in <listcomp>
other_c3_mros = [_c3_mro(base, abcs=abcs) for base in other_bases]
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\functools.py", line 620, in _c3_mro
[explicit_bases] + [abstract_bases] + [other_bases]
File "c:\users\user2\appdata\local\programs\python\python35-32\lib\functools.py", line 571, in _c3_merge
raise RuntimeError("Inconsistent hierarchy")
RuntimeError: Inconsistent hierarchy
16:30:52 C:\Users\user2>

Tensorflow: sess.run(x) not working. InvalidArgumentError: Cannot assign a device for operation 'MatMul': Operation was assigned to /device:GPU:1

I'm using Python 3.6 (Anaconda) on a 64-bit Windows PC, with TensorFlow 1.2.1. I'm running the following simple code:
import tensorflow as tf
sess = tf.Session()
x1 = tf.constant(5)
x2 = tf.constant(6)
# evaluate x1 in the session and print its value
print(sess.run(x1))
It gives me the following error:
Traceback (most recent call last):
File "<ipython-input-64-f7e8ea564f81>", line 7, in <module>
print(sess.run(x1))
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 789, in run
run_metadata_ptr)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
InvalidArgumentError: Cannot assign a device for operation 'MatMul': Operation was explicitly assigned to /device:GPU:1 but available devices are [ /job:localhost/replica:0/task:0/cpu:0 ]. Make sure the device specification refers to a valid device.
[[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/device:GPU:1"](Const_2, Const_3)]]
Caused by op 'MatMul', defined at:
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\spyder\utils\ipython\start_kernel.py", line 227, in <module>
main()
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\spyder\utils\ipython\start_kernel.py", line 223, in main
kernel.start()
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel\kernelapp.py", line 474, in start
ioloop.IOLoop.instance().start()
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\zmq\eventloop\ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tornado\ioloop.py", line 887, in start
handler_func(fd_obj, events)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tornado\stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tornado\stack_context.py", line 275, in null_wrapper
return fn(*args, **kwargs)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 276, in dispatcher
return self.dispatch_shell(stream, msg)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 228, in dispatch_shell
handler(stream, idents, msg)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 390, in execute_request
user_expressions, allow_stdin)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel\ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 501, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2717, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2821, in run_ast_nodes
if self.run_code(code, result):
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-18-02c5e13ac58a>", line 5, in <module>
product = tf.matmul(matrix1, matrix2)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1816, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 1217, in _mat_mul
transpose_b=transpose_b, name=name)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op
op_def=op_def)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "C:\Users\POPEYE.SAILOR\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1269, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Cannot assign a device for operation 'MatMul': Operation was explicitly assigned to /device:GPU:1 but available devices are [ /job:localhost/replica:0/task:0/cpu:0 ]. Make sure the device specification refers to a valid device.
[[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/device:GPU:1"](Const_2, Const_3)]]
Prior to this it was running just fine. I could run this code, but it has suddenly started showing the above error. I have not made any changes to the Anaconda environment, nor have I installed any other packages.
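To make the mismatch in the message explicit, here is a minimal sketch that pulls the operation name, the requested device, and the list of available devices out of the error text quoted above. It uses only the standard library. The helper names `parse_device_error` and `is_requested_device_available` and the regular expression are my own illustration (assumptions, not part of TensorFlow), and the pattern has only been matched against this one message. Note also that the "Caused by op" part of the traceback points at a `tf.matmul` call in an earlier cell (`ipython-input-18`), not at the code shown above.

```python
import re

# Error text copied from the traceback above.
message = (
    "Cannot assign a device for operation 'MatMul': Operation was explicitly "
    "assigned to /device:GPU:1 but available devices are "
    "[ /job:localhost/replica:0/task:0/cpu:0 ]. Make sure the device "
    "specification refers to a valid device."
)


def parse_device_error(text):
    """Return (op_name, requested_device, available_devices) from the message."""
    match = re.search(
        r"operation '(?P<op>[^']+)': Operation was explicitly assigned to "
        r"(?P<requested>\S+) but available devices are \[(?P<available>[^\]]*)\]",
        text,
    )
    if match is None:
        raise ValueError("message does not look like a device-placement error")
    # The device list may be separated by spaces and/or commas.
    available = match.group("available").replace(",", " ").split()
    return match.group("op"), match.group("requested"), available


def is_requested_device_available(requested, available):
    """Compare only the trailing '<type>:<index>' part, ignoring case."""
    suffix = requested.rsplit("/", 1)[-1].replace("device:", "").lower()
    return any(dev.lower().endswith("/" + suffix) for dev in available)


op_name, requested, available = parse_device_error(message)
print("op:", op_name)
print("requested:", requested)
print("available:", available)
print("requested device present:", is_requested_device_available(requested, available))
```

For the message above this reports that `MatMul` was pinned to `/device:GPU:1` while the only visible device is the CPU, which is exactly the mismatch the error complains about.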

Resources