My Google Cloud Storage bucket is laid out as follows:
-data
--labels.pbtxt
--train.record
--test.record
-training
--config file
--packages
My local machine has the data in /tensorflow/models/research/object_detection laid out the same way, with one addition:
-training
--cloud.yml
I'm running the following command to start a job on Google Cloud ML Engine:
gcloud ml-engine jobs submit training object_detection_0.1 \
--job-dir=gs://{BUCKET NAME}/training \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
--module-name object_detection.train \
--region us-central1 \
--config /##/##/models/research/object_detection/training \
-- \
--train_dir=gs://{BUCKET NAME}/training \
--pipeline_config_path=gs://{BUCKET NAME}/training/config_file.config
The Google Cloud logs show the following error:
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 49, in <module>
from object_detection import trainer
File "/root/.local/lib/python2.7/site-packages/object_detection/trainer.py", line 33, in <module>
from deployment import model_deploy
ImportError: No module named deployment
replica worker 0,1,2,3 - same error
The replica worker 4 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 49, in <module>
from object_detection import trainer
File "/root/.local/lib/python2.7/site-packages/object_detection/trainer.py", line 33, in <module>
from deployment import model_deploy
ImportError: No module named deployment
replica ps 0,1 - same error
The replica ps 2 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 49, in <module>
from object_detection import trainer
File "/root/.local/lib/python2.7/site-packages/object_detection/trainer.py", line 33, in <module>
from deployment import model_deploy
ImportError: No module named deployment
I am having the same problem with the deeplab model. It seems the code refers to this folder (deployment), because it works for me if I place the folder where it can be imported properly.
By the way, let me know how you solved it.
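One way to check whether a package such as deployment actually ships inside one of the tarballs passed to --packages is to list the packages in the archive. The following is only a sketch: the helper name packages_in_sdist is made up for illustration, and it assumes the usual sdist layout of <project>-<version>/<package>/__init__.py.

```python
import tarfile


def packages_in_sdist(archive_path):
    """Return the sorted names of top-level packages found in an sdist tarball.

    Hypothetical helper, not part of gcloud or the object detection code.
    Assumes the layout <project>-<version>/<package>/__init__.py.
    """
    packages = set()
    with tarfile.open(archive_path, "r:gz") as archive:
        for name in archive.getnames():
            parts = name.split("/")
            # Only count packages directly below the project root directory.
            if len(parts) == 3 and parts[2] == "__init__.py":
                packages.add(parts[1])
    return sorted(packages)
```

If "deployment" is missing from the result for slim/dist/slim-0.1.tar.gz, the workers would have nothing to import under that name, which would match the ImportError above.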
When I try to use hyperparameter tuning on SageMaker I get this error:
UnexpectedStatusException: Error for HyperParameterTuning job imageclassif-job-10-21-47-43: Failed. Reason: No training job succeeded after 5 attempts. Please take a look at the training job failures to get more details.
When I look up the logs on CloudWatch all 5 failed training jobs have the same error at the end:
Traceback (most recent call last):
File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/opt/ml/code/train.py", line 117, in <module>
parser.add_argument('--data-dir', type=str, default=os.environ['SM_CHANNEL_TRAINING'])
File "/usr/lib/python3.5/os.py", line 725, in __getitem__
raise KeyError(key) from None
KeyError: 'SM_CHANNEL_TRAINING'
The problem is at Step 4 of the project: https://github.com/petrooha/Deploying-LSTM/blob/main/SageMaker%20Project.ipynb
I would highly appreciate any hints on where to look next.
In your train.py file, changing the environment variable from
parser.add_argument('--data-dir', type=str, default=os.environ['SM_CHANNEL_TRAINING'])
to
parser.add_argument('--data-dir', type=str, default=os.environ['SM_CHANNEL_TRAIN'])
should address the issue.
This is the case with PyTorch framework_version 1.3.1, but other versions might also be affected.
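SageMaker names these variables after the input channels (SM_CHANNEL_<NAME>), so which one exists depends on the channel name used when the job is started. A minimal sketch of a train.py argument parser that tolerates either name instead of raising KeyError; the helper names resolve_data_dir and build_parser are hypothetical, and only the two channel names from this thread are checked:

```python
import argparse
import os


def resolve_data_dir(environ):
    """Return the first channel directory found, or None if neither is set."""
    for name in ("SM_CHANNEL_TRAINING", "SM_CHANNEL_TRAIN"):
        if name in environ:
            return environ[name]
    return None


def build_parser(environ=None):
    """Build the argument parser; `environ` defaults to the real environment."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    # Using a lookup with a fallback avoids a KeyError at import time.
    parser.add_argument("--data-dir", type=str, default=resolve_data_dir(environ))
    return parser
```

With this, build_parser().parse_args() leaves data_dir as None when neither variable is set, so the script can report a clear error itself rather than failing inside os.environ.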
I have a problem with Spyder (version 4.0.1) regarding the kernels running in the IPython Console. I have tried many ways to resolve the issue, such as running some commands in the Anaconda Prompt and resetting the settings to their defaults. I even updated Anaconda and Spyder. Nevertheless, nothing has changed and the issue still exists.
This is the error I am receiving:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\ProgramData\Anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\console\__main__.py", line 11, in <module>
start.main()
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\console\start.py", line 287, in main
import_spydercustomize()
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\console\start.py", line 39, in import_spydercustomize
import spydercustomize
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 24, in <module>
from IPython.core.getipython import get_ipython
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\__init__.py", line 56, in <module>
from .terminal.embed import embed
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\terminal\embed.py", line 14, in <module>
from IPython.core.magic import Magics, magics_class, line_magic
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\magic.py", line 20, in <module>
from . import oinspect
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\oinspect.py", line 30, in <module>
from IPython.lib.pretty import pretty
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\lib\pretty.py", line 82, in <module>
import datetime
File "C:\Users\mahkam\datetime.py", line 4
^
SyntaxError: EOF while scanning triple-quoted string literal
(Spyder maintainer here) You need to rename or remove this file:
C:\Users\mahkam\datetime.py
That file has the same name as a Python standard-library module, so it gets imported instead of the real datetime, and that breaks other modules that depend on it.
It looks like you have a quoting error:
File "C:\Users\mahkam\datetime.py", line 4
^
SyntaxError: EOF while scanning triple-quoted string literal
Check your datetime.py.
I want to start my docker-compose and I always get this error.
Docker Desktop tells me I'm logged in. I also rebooted once and logged in again.
I don't quite understand why that's not possible. If I pull other Docker containers in another project, everything works.
We don't use Python in our project.
$ docker --version
Docker version 19.03.8, build afacb8b
$ docker-compose --version
docker-compose version 1.25.4, build 8d51620a
$ python --version
Python 3.7.4
macOS Catalina 10.15.3
Here is the stack trace:
> docker-compose up
Pulling mongo (mongo:latest)...
Traceback (most recent call last):
File "site-packages/docker/credentials/store.py", line 80, in _execute
File "subprocess.py", line 411, in check_output
File "subprocess.py", line 488, in run
File "subprocess.py", line 800, in __init__
File "subprocess.py", line 1551, in _execute_child
OSError: [Errno 8] Exec format error: '/usr/local/bin/docker-credential-ecr-login'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "site-packages/docker/auth.py", line 264, in _resolve_authconfig_credstore
File "site-packages/docker/credentials/store.py", line 35, in get
File "site-packages/docker/credentials/store.py", line 104, in _execute
docker.credentials.errors.StoreError: Unexpected OS error "Exec format error", errno=8
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "docker-compose", line 6, in <module>
File "compose/cli/main.py", line 72, in main
File "compose/cli/main.py", line 128, in perform_command
File "compose/cli/main.py", line 1077, in up
File "compose/cli/main.py", line 1073, in up
File "compose/project.py", line 548, in up
File "compose/service.py", line 361, in ensure_image_exists
File "compose/service.py", line 1250, in pull
File "compose/progress_stream.py", line 102, in get_digest_from_pull
File "compose/service.py", line 1215, in _do_pull
File "site-packages/docker/api/image.py", line 396, in pull
File "site-packages/docker/auth.py", line 48, in get_config_header
File "site-packages/docker/auth.py", line 324, in resolve_authconfig
File "site-packages/docker/auth.py", line 235, in resolve_authconfig
File "site-packages/docker/auth.py", line 281, in _resolve_authconfig_credstore
docker.errors.DockerException: Credentials store error: StoreError('Unexpected OS error "Exec format error", errno=8')
[52557] Failed to execute script docker-compose
Resetting Docker in docker-hub/settings solved the problem.
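The trace shows docker-compose trying to run a credential helper binary, /usr/local/bin/docker-credential-ecr-login. Helpers like this are normally referenced from the Docker client config through the credsStore and credHelpers keys, so a quick way to see which ones are configured is a sketch like this. The function names are hypothetical, and ~/.docker/config.json is assumed to be the config location.

```python
import json
import os
import shutil


def credential_helpers(config):
    """Return the helper binary names referenced by a Docker client config dict."""
    names = set()
    store = config.get("credsStore")
    if store:
        names.add(store)
    # credHelpers maps a registry host to a helper suffix.
    names.update(config.get("credHelpers", {}).values())
    return sorted("docker-credential-" + name for name in names)


def report(config_path=os.path.expanduser("~/.docker/config.json")):
    """Print each configured helper and where it resolves on PATH, if anywhere."""
    with open(config_path) as handle:
        config = json.load(handle)
    for helper in credential_helpers(config):
        print(helper, "->", shutil.which(helper) or "not found on PATH")
```

Running report() lists each helper and its location, which makes it easier to spot a helper binary that is present but not runnable on this machine.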
I'm new to Azure and I'm getting this KeyError when deploying my Python function to Azure. I'm not sure what the reason is.
I have added just one package, "tweepy == 3.8.0", to my requirements.txt, and the deployment seems to crash during its installation. The PySocks package is probably just a dependency of tweepy.
I have no such issues when I debug it locally. The function runs absolutely fine locally.
How can I resolve this deployment issue?
Error:
There was an error restoring dependencies. Traceback (most recent call last):
File "C:\Users\anjan\AppData\Local\Programs\Python\Python36\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\Users\anjan\AppData\Local\Programs\Python\Python36\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\anjan\AppData\Roaming\npm\node_modules\azure-functions-core-tools\bin\tools\python\packapp\__main__.py", line 234, in <module>
main()
File "C:\Users\anjan\AppData\Roaming\npm\node_modules\azure-functions-core-tools\bin\tools\python\packapp\__main__.py", line 60, in main
find_and_build_deps(args)
File "C:\Users\anjan\AppData\Roaming\npm\node_modules\azure-functions-core-tools\bin\tools\python\packapp\__main__.py", line 142, in find_and_build_deps
wheel.install(paths, maker)
File "C:\Users\anjan\AppData\Roaming\npm\node_modules\azure-functions-core-tools\bin\tools\python\packapp\distlib\wheel.py", line 519, in install
row = records[u_arcname]
KeyError: 'PySocks-1.7.0.dist-info/'
The "func: pack" task has been a common problem for users. I solved it by trying a preview feature that is meant to address this: https://github.com/microsoft/vscode-azurefunctions/wiki/Server-Side-Build
I followed the official vagrant-dcos instructions to install Cassandra with a minimal setup by running the command below, and got errors. Any idea?
dcos package install --options=examples/oinker/pkg-cassandra.json cassandra --yes
See the error below:
Traceback (most recent call last):
File "cli/dcoscli/subcommand.py", line 101, in run_and_capture
File "cli/dcoscli/package/main.py", line 22, in main
File "cli/dcoscli/util.py", line 22, in wrapper
File "cli/dcoscli/package/main.py", line 36, in _main
File "dcos/cmds.py", line 43, in execute
File "cli/dcoscli/package/main.py", line 322, in _install
File "dcos/packagemanager.py", line 177, in get_package_version
File "dcos/packagemanager.py", line 359, in __init__
File "cli/env/lib/python3.5/site-packages/requests/models.py", line 866, in json
File "json/__init__.py", line 319, in loads
File "json/decoder.py", line 339, in decode
File "json/decoder.py", line 357, in raw_decode
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
This seems to be an issue with Cosmos. I restarted the whole Vagrant environment and it worked fine.
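For what it's worth, that JSONDecodeError message is what Python's json module raises when the text it receives is empty or does not start with a JSON value, which would be consistent with the package service answering with an empty or non-JSON body. A minimal illustration (json_error_for is a hypothetical helper, not part of the DC/OS CLI):

```python
import json


def json_error_for(body):
    """Return the decoder's error message for `body`, or None if it parses."""
    try:
        json.loads(body)
    except json.JSONDecodeError as exc:
        return str(exc)
    return None
```

Both json_error_for("") and json_error_for("<html>") give the same "Expecting value: line 1 column 1 (char 0)" text seen in the trace, so the message alone does not say whether the response was empty or an error page.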