Goal: run .py files via dvc.yaml.
The stages before it in dvc.yaml don't produce the error.
dvc exp run:
(venv) me@ubuntu-pcs:~/PycharmProjects/project$ dvc exp run
Stage 'inference' didn't change, skipping
Running stage 'load_data':
> load_data.py
/bin/bash: line 1: load_data.py: Permission denied
ERROR: failed to reproduce 'load_data': failed to run: load_data.py, exited with 126
dvc repro:
(venv) me@ubuntu-pcs:~/PycharmProjects/project$ dvc repro
Stage 'predict' didn't change, skipping
Stage 'evaluate' didn't change, skipping
Stage 'inference' didn't change, skipping
Running stage 'load_data':
> load_data.py
/bin/bash: line 1: load_data.py: Permission denied
ERROR: failed to reproduce 'load_data': failed to run: pdl1_lung_model/load_data.py, exited with 126
dvc doctor:
DVC version: 2.10.2 (pip)
---------------------------------
Platform: Python 3.9.12 on Linux-5.15.0-46-generic-x86_64-with-glibc2.35
Supports:
webhdfs (fsspec = 2022.5.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.5.2),
https (aiohttp = 3.8.1, aiohttp-retry = 2.5.2),
s3 (s3fs = 2022.5.0, boto3 = 1.21.21)
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/nvme0n1p5
Caches: local
Remotes: s3
Workspace directory: ext4 on /dev/nvme0n1p5
Repo: dvc, git
dvc exp run -v:
output.txt
dvc exp run -vv:
output2.txt
Solution 1
The .py files weren't set up to run as scripts.
They need to be if you want to run one .py file per stage in dvc.yaml.
To do so, append this boilerplate at the bottom of each .py file:
if __name__ == "__main__":
    # Invoke the file's primary function here, with its parameters.
    main()  # hypothetical name; use your own entry-point function
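As a rough illustration of the boilerplate in Solution 1, here is a minimal sketch of what a stage script could look like. The names load_data and main and the sample rows are hypothetical placeholders, not taken from the project in the question.

```python
def load_data(rows):
    """Hypothetical stand-in for the stage's real work: drop blank rows."""
    return [row.strip() for row in rows if row.strip()]


def main():
    """Entry point called when the file is run as a script."""
    cleaned = load_data(["a\n", "\n", "b\n"])  # placeholder input
    print(f"loaded {len(cleaned)} rows")
    return cleaned


# Runs only when the file is executed (e.g. by a dvc.yaml stage),
# not when it is imported from another module.
if __name__ == "__main__":
    main()
```

With the guard in place, importing the file from another stage or a test does not trigger the work, while running it as a script does.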
Solution 2
chmod 777 ....py
Solution 3
I forgot the python prefix in the stage's cmd:
load_data:
  cmd: python pdl1_lung_model/load_data.py
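Exit status 126 is the shell's conventional code for a command that was found but could not be executed, which fits the "Permission denied" message in the logs. The sketch below is one way to check whether a script has any execute bit set. The helper names has_execute_bit and explain are made up for this illustration and are not part of DVC.

```python
import os
import stat


def has_execute_bit(path):
    """Return True if any user/group/other execute bit is set on path."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


def explain(path):
    """Describe how the file can be launched from a stage's cmd."""
    if has_execute_bit(path):
        return (f"{path} has an execute bit; it still needs a shebang line "
                "to run without an interpreter prefix")
    return (f"{path} has no execute bit; run it as 'python {path}' "
            "or add the bit with chmod +x")
```

Prefixing the command with python, as in Solution 3, sidesteps the execute bit entirely because the shell then executes the interpreter rather than the .py file.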
I have installed RabbitMQ inside a docker container on Ubuntu 20.04 using the following steps: https://computingforgeeks.com/how-to-install-latest-rabbitmq-server-on-ubuntu-linux/
It works as expected and I can use it in a sample python program.
After installation I exit the container and save it to a new docker image using:
docker commit *my_container_id* *my_new_container_name*
docker save -o image_with_rabbitmq.tar *my_new_container_name*
I then load this new image and run it like so:
docker load -i image_with_rabbitmq.tar
docker run --rm --gpus all -it *my_image_id*
But now I find that I cannot start or use rabbitmq anymore. I get the following error:
service rabbitmq-server status
Distribution failed: {{:shutdown, {:failed_to_start_child, :net_kernel, {:EXIT, :nodistribution}}}, {:child, :undefined, :net_sup_dynamic, {:erl_distribution, :start_link, [[:"rabbitmqcli-184-rabbit@231b2003671d", :shortnames, 15000], false, :net_sup_dynamic]}, :permanent, false, 1000, :supervisor, [:erl_distribution]}}
Here is the startup log message:
cat /var/log/rabbitmq/startup_err
BOOT FAILED
===========
Exception during startup:
error:{badmatch,{error,{{shutdown,{failed_to_start_child,net_kernel, {'EXIT',nodistribution}}},{child,undefined,net_sup_dynamic,{erl_distribution,start_link,[[rabbit_prelaunch_9642@localhost,shortnames],false,net_sup_dynamic]},permanent,false,1000,supervisor,[erl_distribution]}}}}
rabbit_prelaunch_dist:duplicate_node_check/1, line 78
rabbit_prelaunch_dist:setup/1, line 23
rabbit_prelaunch:do_run/0, line 115
rabbit_prelaunch:run_prelaunch_first_phase/0, line 32
supervisor:do_start_child_i/3, line 414
supervisor:do_start_child/2, line 400
supervisor:-start_children/2-fun-0-/3, line 384
supervisor:children_map/4, line 1250
Kernel pid terminated (application_controller) ({application_start_failure,rabbitmq_prelaunch,{{shutdown,{failed_to_start_child,prelaunch,{badmatch,{error,{{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}},{child,undefined,net_sup_dynamic,{erl_distribution,start_link,[[rabbit_prelaunch_9642@localhost,shortnames],false,net_sup_dynamic]},permanent,false,1000,supervisor,[erl_distribution]}}}}}},{rabbit_prelaunch_app,start,[normal,[]]}}})
Crash dump is being written to: erl_crash.dump...done
How can I ensure this starts up properly when my container starts?
I am running a program under MPI; its executable is myapp_mpi. The command line below launches the app correctly:
mpirun --bind-to none -np 128 myapp_mpi apprun -resethway -noconfout -nsteps 8000 -s benchData.tpr -cpo state.cpt -e ener.edr -dlb no -pin off -v
I now want to constrain the thread binding with a script, pin.sh. The command line below produces an error:
mpirun --bind-to none -np 128 pin.sh myapp_mpi apprun -resethway -noconfout -nsteps 8000 -s benchData.tpr -cpo state.cpt -e ener.edr -dlb no -pin off -v
I get the following error:
--------------------------------------------------------------------------
Open MPI tried to fork a new process via the "execve" system call but
failed. Open MPI checks many things before attempting to launch a
child process, but nothing is perfect. This error may be indicative
of another problem on the target host, or even something as silly as
having specified a directory for your application. Your job will now
abort.
Local host: machine001
Working dir: /home/user/myapp/bin
Application name: /home/user/myapp/bin/pin.sh
Error: Exec format error
--------------------------------------------------------------------------
mpirun: Forwarding signal 18 to job
mpirun: Forwarding signal 18 to job
--------------------------------------------------------------------------
mpirun was unable to start the specified application as it encountered an
error:
Error code: 1
Error name: (null)
Node: machine001
when attempting to start process rank 0.
--------------------------------------------------------------------------
125 total processes failed to start
As far as I can tell, the file locations are correct. Any clue?
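"Exec format error" is what the execve system call reports when the kernel does not recognise the file's format. For a shell script, one common cause is a missing #! line at the very top: an interactive shell will often fall back to interpreting such a file itself, but a direct execve (as mpirun performs) will not. The contents of pin.sh are not shown here, so this is only something to rule out. A minimal sketch of the check, with hypothetical helper names:

```python
def has_shebang(path):
    """Return True if the file starts with the two bytes '#!'."""
    with open(path, "rb") as fh:
        return fh.read(2) == b"#!"


def shebang_interpreter(path):
    """Return the text after '#!' on the first line, or None if absent."""
    with open(path, "rb") as fh:
        first_line = fh.readline()
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].decode("utf-8", errors="replace").strip()
```

If has_shebang returns False for the wrapper script, adding a first line such as #!/bin/bash and making sure the file is executable would be the usual remedy.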
I am trying to run a Nextflow pipeline on some data from the Linux command line, but it fails because the Conda environment cannot be created.
It looks like it tries to run the pipeline anyway, despite the environment not being set up properly, and so generates an error message. Any help would be much appreciated. Here is the error message:
Error executing process > 'my_process (1)'
Caused by:
Failed to create Conda environment
command: conda env create --prefix /my_file_path-6bf38a923b48a255f96ea3d66d372e6c --file /my_file_path/environment.yml
status : 143
message:
Here is my environment.yml file:
name: pipeline_name
channels:
- bioconda
- conda-forge
- defaults
dependencies:
- filtlong
- blast==2.5
- minimap2
- samtools
- pysam
- pandas
- matplotlib
- pysamstats
- seaborn
- medaka
- bedtools
- bedops
- seqtk
- bioawk
- sniffles
Not an answer to this question, but if you get a similar failure with a different exit status (120, not 143), try the fix in this thread. Reposting it here:
conda environment from file not working using nextflow · Issue #1081 · nextflow-io/nextflow: https://github.com/nextflow-io/nextflow/issues/1081
pditommaso commented on Mar 18, 2019
The 120 exit status signals that
it was reached the creation timeout. Try increasing it, eg.
conda.createTimeout = '1 h'
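The two statuses discussed here read differently. By shell convention, a status above 128 means the process was terminated by signal number (status - 128), so 143 corresponds to signal 15, SIGTERM. A status of 120 is below that range and, according to the comment quoted above, marks the creation timeout. What sent the SIGTERM in the original question is not stated, so the sketch below only decodes the number; describe_exit_status is a made-up helper name.

```python
import signal


def describe_exit_status(status):
    """Decode a shell-style exit status into a short description."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with code {status}"


print(describe_exit_status(143))
print(describe_exit_status(120))
```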
The docs for the %run magic -e option state:
ignore sys.exit() calls or SystemExit exceptions in the script being
run. This is particularly useful if IPython is being used to run
unittests, which always exit with a sys.exit() call. In such cases you
are interested in the output of the test results, not in seeing a
traceback of the unittest module.
This works when running scripts but doesn't seem to work when running modules.
So when I type %run -e -m pytest, I still get a traceback when a test fails, due to the SystemExit raised by pytest, which is exactly the case the docs above say -e is meant to address. I know I can type !pytest, but I don't want to wait until pytest has completed before I start to see results, and I also want to add the current directory to the module search path.
I am running IPython within Spyder, but the behaviour is the same if I run IPython from the Windows command prompt. Is there any way of doing what I want and avoiding the distracting traceback?
I ran the following test with %run -m pytest from the Spyder IPython console:
import pytest

def test_fail():
    assert 0
The output was:
============================= test session starts =============================
platform win32 -- Python 3.7.7, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: D:\home\shane\temp\pytest
collected 1 item
test_dummy.py F [100%]
================================== FAILURES ===================================
__________________________________ test_fail __________________________________
    def test_fail():
>       assert 0
E       AssertionError
test_dummy.py:21: AssertionError
=========================== short test summary info ===========================
FAILED test_dummy.py::test_fail - AssertionError
============================== 1 failed in 0.03s ==============================
Traceback (most recent call last):
  File "c:\opt\python37\lib\runpy.py", line 205, in run_module
    return _run_module_code(code, init_globals, run_name, mod_spec)
  File "c:\opt\python37\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "c:\opt\python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "c:\opt\python37\lib\site-packages\pytest\__main__.py", line 7, in <module>
    raise SystemExit(pytest.main())
SystemExit: ExitCode.TESTS_FAILED
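One possible workaround, sketched below, is to run the module in-process with the standard library's runpy and catch SystemExit yourself, so the exit code comes back as a value and no traceback is printed. The helper name run_module_no_traceback is hypothetical. It also puts the current directory on the module search path, as asked above.

```python
import os
import runpy
import sys


def run_module_no_traceback(module_name, args=()):
    """Run a module like `python -m module_name` and return its exit code."""
    saved_argv = sys.argv
    sys.argv = [module_name, *args]  # what the module sees as its command line
    cwd = os.getcwd()
    if cwd not in sys.path:
        sys.path.insert(0, cwd)  # make the current directory importable
    try:
        runpy.run_module(module_name, run_name="__main__")
    except SystemExit as exc:
        # sys.exit() with no argument carries None, which means success.
        return 0 if exc.code is None else exc.code
    finally:
        sys.argv = saved_argv
    return 0
```

In an IPython cell this would be called as run_module_no_traceback("pytest"), and output should appear as the tests run because everything happens in the same process. For pytest specifically, the last frame of the traceback shows that pytest/__main__.py only wraps pytest.main() in a SystemExit, so calling pytest.main() directly after import pytest should return the exit code without raising.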