System Details:
Operating System: Ubuntu 19.04
Anaconda version: 2019.03
Python version: 3.7.3
mlflow version: 1.0.0
Steps to Reproduce: https://mlflow.org/docs/latest/tutorial.html
Error at line/command: mlflow models serve -m [path_to_model] -p 1234
Error:
Command 'source activate mlflow-c4536834c2e6e0e2472b58bfb28dce35b4bd0be6 1>&2 && gunicorn --timeout 60 -b 127.0.0.1:1234 -w 4 mlflow.pyfunc.scoring_server.wsgi:app' returned non zero return code. Return code = 1
Terminal Log:
(mlflow) root@user:/home/user/mlflow/mlflow/examples/sklearn_elasticnet_wine/mlruns/0/e3dd02d5d84545ffab858db13ede7366/artifacts/model# mlflow models serve -m $(pwd) -p 1234
2019/06/18 16:15:16 INFO mlflow.models.cli: Selected backend for flavor 'python_function'
2019/06/18 16:15:17 INFO mlflow.pyfunc.backend: === Running command 'source activate mlflow-c4536834c2e6e0e2472b58bfb28dce35b4bd0be6 1>&2 && gunicorn --timeout 60 -b 127.0.0.1:1234 -w 4 mlflow.pyfunc.scoring_server.wsgi:app'
bash: activate: No such file or directory
Traceback (most recent call last):
File "/root/anaconda3/envs/mlflow/bin/mlflow", line 10, in <module>
sys.exit(cli())
File "/root/anaconda3/envs/mlflow/lib/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/root/anaconda3/envs/mlflow/lib/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/root/anaconda3/envs/mlflow/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/root/anaconda3/envs/mlflow/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/root/anaconda3/envs/mlflow/lib/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/root/anaconda3/envs/mlflow/lib/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/root/anaconda3/envs/mlflow/lib/python3.7/site-packages/mlflow/models/cli.py", line 43, in serve
host=host)
File "/root/anaconda3/envs/mlflow/lib/python3.7/site-packages/mlflow/pyfunc/backend.py", line 76, in serve
command_env=command_env)
File "/root/anaconda3/envs/mlflow/lib/python3.7/site-packages/mlflow/pyfunc/backend.py", line 147, in _execute_in_conda_env
command, rc
Exception: Command 'source activate mlflow-c4536834c2e6e0e2472b58bfb28dce35b4bd0be6 1>&2 && gunicorn --timeout 60 -b 127.0.0.1:1234 -w 4 mlflow.pyfunc.scoring_server.wsgi:app' returned non zero return code. Return code = 1
(mlflow) root@user:/home/user/mlflow/mlflow/examples/sklearn_elasticnet_wine/mlruns/0/e3dd02d5d84545ffab858db13ede7366/artifacts/model#
Following the steps mentioned in GitHub issue 1507 (https://github.com/mlflow/mlflow/issues/1507), I was able to resolve this issue.
The root cause is that the "anaconda3/bin/" directory is never added to the PATH environment variable, so the shell spawned by mlflow cannot find the "activate" script. To work around it, add the "else" branch of the conda initialize block in your ~/.bashrc to your PATH:
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/home/atulk/anaconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/home/atulk/anaconda3/etc/profile.d/conda.sh" ]; then
        . "/home/atulk/anaconda3/etc/profile.d/conda.sh"
    else
        export PATH="/home/atulk/anaconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
In this case, I added export PATH="/home/atulk/anaconda3/bin:$PATH" to my PATH variable. However, this is just a temporary workaround until the issue is fixed in the project.
If you are not using Anaconda, the equivalent is to put your Python installation's bin directory on PATH, e.g.:
export PATH=$PATH:/path/to/python/Python/2.7/bin
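Putting the Anaconda case together, the workaround can be applied in the same shell that runs the serve command (a sketch: the Anaconda path is the one from this machine, and $(pwd) assumes you are still in the model's artifact directory as in the log above):
# one-off for the current shell; append the export line to ~/.bashrc to make it persistent
export PATH="/home/atulk/anaconda3/bin:$PATH"
mlflow models serve -m $(pwd) -p 1234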
I am deploying a Docker container to AWS Lambda that executes a machine learning model, but whenever I attempt to load the saved checkpoints I get either a permission-denied error when reading from the local filesystem or a "no such directory" error, depending on where I store the model.
Here is the relevant Dockerfile:
FROM public.ecr.aws/lambda/python:3.8
# copy requirements.txt file to the container
COPY requirements.txt ./
# upgrade pip and install the python requirements from requirements.txt
RUN python3.8 -m pip install \
    --upgrade pip
RUN python3.8 -m pip install \
    -r requirements.txt
# Copy function code
COPY app.py ./
# Install the runtime interface client
RUN python3.8 -m pip install \
    awslambdaric
# clean up image for small container
RUN find . -type d -name "tests" -exec rm -rf {} +
RUN find . -type d -name "__pycache__" -exec rm -rf {} +
RUN find . -type d -name "include" -exec rm -rf {} +
RUN rm -rf ./{caffe2,wheel,wheel-*,pkg_resources,boto*,aws*,pip,pip-*,pipenv,setuptools}
RUN rm -rf ./{*.egg-info,*.dist-info}
RUN find . -name \*.pyc -delete
RUN find . -type d -name "test" -exec rm -rf {} +
RUN ls -R -al
# update linux libraries
RUN yum update -y
# install python3 and unzip
RUN yum install -y python3 unzip
# pull model files
RUN mkdir ./model
RUN curl https://somewhere.com/model.zip -o ./model/model.zip
RUN unzip ./model/model.zip -d ./model
RUN chmod 644 ./model
RUN chmod -R 644 ./model/*
RUN rm ./model/model.zip
WORKDIR ./
ENTRYPOINT [ "python3", "-m", "awslambdaric" ]
CMD [ "app.lambda_handler" ]
And the app.py:
from __future__ import print_function
import json, time
import urllib.request
from jose import jwk, jwt
from jose.utils import base64url_decode
from sentence_transformers import SentenceTransformer, util
model = ModelFunction('/model')
def lambda_handler(event, context):
    body = json.loads(event['body'])
    token = body['jwttoken']
    utterance = body['utterance']
    comparestring = body['comparestring']
    # generate embeddings for each phrase
    embeddings1 = model.encode(utterance, convert_to_tensor=True)
    embeddings2 = model.encode(comparestring, convert_to_tensor=True)
    # compute score
    score = util.embeddings(embeddings1, embeddings2)
    # output score
    print("Score:")
    print(score.item())
    return {
        "statusCode": 200,
        "score": json.dumps(score.item())
    }
# the following is useful to make this script executable in both
# AWS Lambda and any other local environments
if __name__ == '__main__':
    event = {
        'token': '',
        'email': 'somewhere@somewhere.com'}
    lambda_handler(event, None)
A couple of the relevant error messages:
[ERROR] PermissionError: [Errno 13] Permission denied: './model/modules.json'
Traceback (most recent call last):
File "/var/lang/lib/python3.8/imp.py", line 234, in load_module
return load_source(name, filename, file)
File "/var/lang/lib/python3.8/imp.py", line 171, in load_source
module = _load(spec)
File "<frozen importlib._bootstrap>", line 702, in _load
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 848, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/var/task/app.py", line 8, in <module>
model = model('./model')
File "/var/lang/lib/python3.8/site-packages/model/model.py", line 115, in __init__
with open(os.path.join(model_path, 'modules.json')) as fIn:
and
[ERROR] FileNotFoundError: [Errno 2] No such file or directory: '/home/sbx_user1051/.cache/model'
Traceback (most recent call last):
File "/var/lang/lib/python3.8/imp.py", line 234, in load_module
return load_source(name, filename, file)
File "/var/lang/lib/python3.8/imp.py", line 171, in load_source
module = _load(spec)
File "<frozen importlib._bootstrap>", line 702, in _load
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 848, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/var/task/app.py", line 8, in <module>
model = model('/model')
File "/var/lang/lib/python3.8/site-packages/model/model.py", line 101, in __init__
shutil.rmtree(model_path)
File "/var/lang/lib/python3.8/shutil.py", line 709, in rmtree
onerror(os.lstat, path, sys.exc_info())
File "/var/lang/lib/python3.8/shutil.py", line 707, in rmtree
orig_st = os.lstat(path)
Any ideas? Am I perhaps not storing the model checkpoints in the correct directory? (The logs above have been somewhat scrubbed due to NDA.)
Try running:
chmod 644 $(find . -type f)
chmod 755 $(find . -type d)
on the files you're trying to deploy.
As the AWS docs say, there may be a permissions issue if the files aren't readable (and the directories aren't traversable) by any user.
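In the Dockerfile above, the same thing could be done in a single step right after the model is unzipped (a sketch; ./model is the directory created earlier in that Dockerfile):
# make files world-readable and directories world-traversable; the capital X
# adds the execute bit only where it is needed (directories)
chmod -R a+rX ./model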
Good day all!
I have written a very simple Ansible role to update all packages on SUSE Leap 15.2:
- name: All packages updated
  package:
    name: "*"
    state: latest
but it seems that the Zypper module has a problem with it:
TASK [system_update : All packages updated] ***************************************************************************************************************************************************************************************************
task path: /home/merlin/ansible-kt-linux/roles/system_update/tasks/main.yml:10
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: merlin
<localhost> EXEC /bin/sh -c 'echo ~merlin && sleep 0'
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /home/merlin/.ansible/tmp/ansible-tmp-1617094154.778992-48329012899811 `" && echo ansible-tmp-1617094154.778992-48329012899811="` echo /home/merlin/.ansible/tmp/ansible-tmp-1617094154.778992-48329012899811 `" ) && sleep 0'
Using module file /usr/lib/python3.6/site-packages/ansible/modules/packaging/os/zypper.py
<localhost> PUT /home/merlin/.ansible/tmp/ansible-local-5239dx5tukgw/tmpvf5upp37 TO /home/merlin/.ansible/tmp/ansible-tmp-1617094154.778992-48329012899811/AnsiballZ_zypper.py
<localhost> EXEC /bin/sh -c 'chmod u+x /home/merlin/.ansible/tmp/ansible-tmp-1617094154.778992-48329012899811/ /home/merlin/.ansible/tmp/ansible-tmp-1617094154.778992-48329012899811/AnsiballZ_zypper.py && sleep 0'
<localhost> EXEC /bin/sh -c 'sudo -H -S -n -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-qfmrjmpwqhyapufsdqunaohtmlxjucdk ; /usr/bin/python /home/merlin/.ansible/tmp/ansible-tmp-1617094154.778992-48329012899811/AnsiballZ_zypper.py'"'"' && sleep 0'
<localhost> EXEC /bin/sh -c 'rm -f -r /home/merlin/.ansible/tmp/ansible-tmp-1617094154.778992-48329012899811/ > /dev/null 2>&1 && sleep 0'
The full traceback is:
Traceback (most recent call last):
File "/home/merlin/.ansible/tmp/ansible-tmp-1617094154.778992-48329012899811/AnsiballZ_zypper.py", line 102, in <module>
_ansiballz_main()
File "/home/merlin/.ansible/tmp/ansible-tmp-1617094154.778992-48329012899811/AnsiballZ_zypper.py", line 94, in _ansiballz_main
invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)
File "/home/merlin/.ansible/tmp/ansible-tmp-1617094154.778992-48329012899811/AnsiballZ_zypper.py", line 40, in invoke_module
runpy.run_module(mod_name='ansible.modules.packaging.os.zypper', init_globals=None, run_name='__main__', alter_sys=True)
File "/usr/lib64/python2.7/runpy.py", line 188, in run_module
fname, loader, pkg_name)
File "/usr/lib64/python2.7/runpy.py", line 82, in _run_module_code
mod_name, mod_fname, mod_loader, pkg_name)
File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/tmp/ansible_zypper_payload_jYlnfB/ansible_zypper_payload.zip/ansible/modules/packaging/os/zypper.py", line 195, in <module>
ImportError: No module named xml
fatal: [localhost]: FAILED! => {
"changed": false,
"module_stderr": "Traceback (most recent call last):\n File \"/home/merlin/.ansible/tmp/ansible-tmp-1617094154.778992-48329012899811/AnsiballZ_zypper.py\", line 102, in <module>\n _ansiballz_main()\n File \"/home/merlin/.ansible/tmp/ansible-tmp-1617094154.778992-48329012899811/AnsiballZ_zypper.py\", line 94, in _ansiballz_main\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n File \"/home/merlin/.ansible/tmp/ansible-tmp-1617094154.778992-48329012899811/AnsiballZ_zypper.py\", line 40, in invoke_module\n runpy.run_module(mod_name='ansible.modules.packaging.os.zypper', init_globals=None, run_name='__main__', alter_sys=True)\n File \"/usr/lib64/python2.7/runpy.py\", line 188, in run_module\n fname, loader, pkg_name)\n File \"/usr/lib64/python2.7/runpy.py\", line 82, in _run_module_code\n mod_name, mod_fname, mod_loader, pkg_name)\n File \"/usr/lib64/python2.7/runpy.py\", line 72, in _run_code\n exec code in run_globals\n File \"/tmp/ansible_zypper_payload_jYlnfB/ansible_zypper_payload.zip/ansible/modules/packaging/os/zypper.py\", line 195, in <module>\nImportError: No module named xml\n",
"module_stdout": "",
"msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
"rc": 1
}
PLAY RECAP ************************************************************************************************************************************************************************************************************************************
localhost : ok=2 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
Unfortunately I can't tell from this what exactly the problem is. Does anyone know what's going on?
Solved it with the shell module:
- name: "Install python-xml on Suse"
  shell: zypper -n install python-xml
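An alternative sometimes used for this class of error (an assumption on my part, not from the post) is to point Ansible at the system Python 3, which ships the xml module, instead of the default /usr/bin/python 2.7 interpreter:
# "site.yml" is a placeholder for your playbook
ansible-playbook site.yml -e ansible_python_interpreter=/usr/bin/python3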
I'm creating a pipeline using snakemake to call methylation in nanopore sequencing data. I've run snakemake using the --dryrun option and the DAG is constructed successfully, but when I add the option --profile slurm, I get the following error:
(nanopolish) [danielle.perley#talonhead2 nanopolish-CpG-calling]$ snakemake -np --use-conda --profile slurm test_data/20-001-002/20-001-002_fastq_pass.gz
Building DAG of jobs...
Job counts:
count jobs
1 combine_tech_reps
1
InputFunctionException in line 32 of /home/danielle.perley/nanopolish-CpG-calling/Snakefile:
Error:
SyntaxError: invalid syntax (<string>, line 1)
Wildcards:
sample=20-001-002
Traceback:
File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 115, in run_jobs
File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 120, in run
File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 131, in _run
File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 151, in printjob
File "/home/danielle.perley/miniconda3/envs/nanopolish/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 137, in printjob
Line 33 is rule combine_tech_reps in my snakefile. (I'm only showing the first part of my snakefile here)
from snakemake.utils import validate
import pandas as pd
import os.path
import glob
configfile: "config.yaml"
samples_df = pd.read_table(config["samples"],sep = '\t')
samples_df = samples_df.set_index("Sample")
samples = list(samples_df.index.unique())
wildcard_constraints:
    sample = "|".join(samples)

def get_fast5(wildcards):
    f5 = glob.glob(os.path.join(config["raw_data"], wildcards.sample, "2*", "fast5_pass"))
    return(f5)

localrules: all, build_index

rule all:
    input:
        expand("results/Methylation/{sample}_frequency.tsv", sample=samples),
        expand("results/alignments/{sample}_flagstat.txt", sample=samples),
        expand("resources/QC/{sample}_pycoQC.json", sample=samples),
        expand("results/QC/{sample}_pycoQC.html", sample=samples),
        "report/multiQC.html"

rule combine_tech_reps:
    input:
        fqs = lambda wildcards: glob.glob(os.path.join(config["raw_data"], "{sample}", "2*", "{sample}_fastq_pass.gz").format(sample=wildcards.sample))
    output:
        fq = os.path.join(config["raw_data"], "{sample}", "{sample}_fastq_pass.gz")
    shell: """
        zcat {input} > {output}
        """
I have a slurm profile config file at:
~/.config/snakemake/slurm/config.yaml
jobs: 10
cluster: "sbatch -p talon -t {resources.time} --mem={resources.mem} -c {resources.cpus} -o logs_slurm/{rule}_{wildcards} -e logs_slurm/{rule}_{wildcards}"
default-resources: [cpus=1, mem=2000, time=10:00]
use-conda: true
I'd really like to use this pipeline on our HPC, but I'm not sure what's causing this error.
I was able to solve my problem with the help of this post:
InputFunctionException: unexpected EOF while parsing
By adding the verbose flag:
snakemake -np --verbose --use-conda --profile slurm test_data/20-001-002/20-001-002_fastq_pass.gz
I could see that snakemake was having issues with the default resources:
10:00
^
Changing the default resources line of my config.yaml file:
default-resources: [cpus=1, mem=2000, time=600]
removed the error.
I am not sure if default-resources is a valid key in the config.
What happens if you try this as config.yaml:
jobs: 10
cluster: "sbatch -p talon -t {resources.time} --mem={resources.mem} -c {resources.cpus} -o logs_slurm/{rule}_{wildcards} -e logs_slurm/{rule}_{wildcards}"
use-conda: true
__default__:
  time: 10
  cpus: 1
  mem: 2GB
[QTL-seq:2019-10-09 09:13:37] !!ERROR!! bcftools concat -a -O z -o Chikpea_qtl/30_vcf/qtlseq.vcf.gz Chikpea_qtl/30_vcf/qtlseq.*.vcf.gz >> Chikpea_qtl/log/bcftools.log 2>&1
Failed to open Chikpea_qtl/30_vcf/qtlseq.NW_004516646.1.vcf.gz: could not load index
Traceback (most recent call last):
File "/home/jthakur/Desktop/Software/QTL-seq/qtlseq/mpileup.py", line 191, in concat
check=True)
File "/home/jthakur/anaconda2/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'bcftools concat -a -O z -o Chikpea_qtl/30_vcf/qtlseq.vcf.gz Chikpea_qtl/30_vcf/qtlseq.*.vcf.gz >> Chikpea_qtl/log/bcftools.log 2>&1' returned non-zero exit status 255.
Traceback (most recent call last):
File "/home/jthakur/anaconda2/bin/qtlseq", line 11, in
load_entry_point('qtlseq', 'console_scripts', 'qtlseq')()
File "/home/jthakur/Desktop/Software/QTL-seq/qtlseq/qtlseq.py", line 192, in main
QTLseq(args).run()
File "/home/jthakur/Desktop/Software/QTL-seq/qtlseq/qtlseq.py", line 123, in mpileup
mp.run()
File "/home/jthakur/Desktop/Software/QTL-seq/qtlseq/mpileup.py", line 232, in run
self.concat()
File "/home/jthakur/Desktop/Software/QTL-seq/qtlseq/mpileup.py", line 194, in concat
sys.exit()
NameError: name 'sys' is not defined
I got the command below from a blog post about how to set up git with gitosis, but I got a "No such file or directory" error after running it.
[git@209285 ~]$ sudo -H -u git gitosis-init < ~/id_rsa.pub
Traceback (most recent call last):
File "/usr/local/bin/gitosis-init", line 9, in <module>
load_entry_point('gitosis==0.2', 'console_scripts', 'gitosis-init')()
File "/usr/local/lib/python2.7/site-packages/gitosis-0.2-py2.7.egg/gitosis/app.py", line 24, in run
return app.main()
File "/usr/local/lib/python2.7/site-packages/gitosis-0.2-py2.7.egg/gitosis/app.py", line 38, in main
self.handle_args(parser, cfg, options, args)
File "/usr/local/lib/python2.7/site-packages/gitosis-0.2-py2.7.egg/gitosis/init.py", line 138, in handle_args
user=user,
File "/usr/local/lib/python2.7/site-packages/gitosis-0.2-py2.7.egg/gitosis/init.py", line 75, in init_admin_repository
template=resource_filename('gitosis.templates', 'admin')
File "/usr/local/lib/python2.7/site-packages/gitosis-0.2-py2.7.egg/gitosis/repository.py", line 63, in init
close_fds=True,
File "/usr/local/lib/python2.7/subprocess.py", line 522, in call
return Popen(*popenargs, **kwargs).wait()
File "/usr/local/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/local/lib/python2.7/subprocess.py", line 1335, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
I am puzzled, as the man page says:
-H  The -H (HOME) option sets the HOME environment variable to the homedir of the target user (root by default) as specified in passwd(5). By default, sudo does not modify HOME (see set_home and always_set_home in sudoers(5)).
(cited from the Linux man page). So the -H option just sets the HOME environment variable to the homedir of the target user as specified in passwd, and I have specified "/home/git" as the homedir for the git user in my /etc/passwd file:
apache:x:48:48:Apache:/var/www:/sbin/nologin
git:x:100:101:git version control:/home/git:/bin/bash
duanduan:x:101:500::/home/duanduan:/bin/bash
So why do I still get this message? Or is my understanding of the manual's description incorrect?
Update (for the comments):
It behaves the same as before even when I specify an absolute path, so maybe that's not the cause:
sudo -H -u git gitosis-init < /home/git/id_rsa.pub
Traceback (most recent call last):
File "/usr/local/bin/gitosis-init", line 9, in <module>
load_entry_point('gitosis==0.2', 'console_scripts', 'gitosis-init')()
File "/usr/local/lib/python2.7/site-packages/gitosis-0.2-py2.7.egg/gitosis/app.py", line 24, in run
return app.main()
File "/usr/local/lib/python2.7/site-packages/gitosis-0.2-py2.7.egg/gitosis/app.py", line 38, in main
self.handle_args(parser, cfg, options, args)
File "/usr/local/lib/python2.7/site-packages/gitosis-0.2-py2.7.egg/gitosis/init.py", line 138, in handle_args
user=user,
File "/usr/local/lib/python2.7/site-packages/gitosis-0.2-py2.7.egg/gitosis/init.py", line 75, in init_admin_repository
template=resource_filename('gitosis.templates', 'admin')
File "/usr/local/lib/python2.7/site-packages/gitosis-0.2-py2.7.egg/gitosis/repository.py", line 63, in init
close_fds=True,
File "/usr/local/lib/python2.7/subprocess.py", line 522, in call
return Popen(*popenargs, **kwargs).wait()
File "/usr/local/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/local/lib/python2.7/subprocess.py", line 1335, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
I guess it is because ~ is expanded by bash before being passed to sudo as an argument, so why not try specifying an absolute path for your public key file?
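As a quick illustration of the expansion order (an aside, not from the original post), both the tilde and the input redirection are handled by the invoking shell before sudo ever runs, so they refer to the invoking user's files:
# both lines are processed by *your* shell, not by the git user
echo ~                # prints the invoking user's home directory, not /home/git
ls -l ~/id_rsa.pub    # this is the file that "< ~/id_rsa.pub" actually read from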