I'm using nohup to run jobs in the background on the machines I get through BSUB and SSH.
My primary machine runs RHEL. From there I pick up an AIX machine through BSUB (which submits a job to LSF) and also do an SSH login to another server.
After getting these two machines, I execute a script (inner.sh) on each of them through nohup.
I capture the respective PIDs through echo $$ inside the script I am executing (inner.sh).
After submitting the nohup execution in the background, I exit both machines and land back on the primary RHEL machine.
Now, from this RHEL machine, I try to get the status of the nohup execution with ps -p PID using the two previously captured PIDs, but no process is listed.
Top-level wrapper script, wrapper.sh:
#!/bin/bash
#login to a remote server
ssh -k xyz@abc < env_setup.sh
#picking up an AIX machine from LSF
bsub -q night -Is ksh -i env_setup.sh
ps -p "$(cat /path/to/dir/process-<AIX_machine>.pid)"
#got no output
ps -p "$(cat /path/to/dir/process-<server_machine>.pid)"
#got no output
Script passed to the machines picked up by BSUB/SSH to run nohup, env_setup.sh:
#!/bin/bash
nohup sh /path/to/dir/inner.sh > /path/to/dir/log-<hostname>.out &
exit
The actual script I am trying to execute on the machines picked up by BSUB/SSH, inner.sh:
#!/bin/bash
echo $$ > /path/to/dir/process-<hostname>.pid
#hope this gives us the correct PID on the remote machine
#execute some other commands
Now I am getting two process-<hostname>.pid files, each updated with the PID from one of the two machines.
But ps -p in the wrapper script gives no output.
I am picking up the process IDs from the remote machines and running ps -p on my local RHEL machine.
Is that the reason I am not getting any status for those two processes?
Can I do anything else to get the status?
ps shows the status of local processes only, and the PIDs you captured belong to processes on the remote machines. Run ps on each remote machine instead, or use LSF commands such as bjobs to get the status of the job you submitted with bsub.
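A minimal sketch of both checks, assuming the PID-file paths from the question and a placeholder LSF job ID:
#check the SSH machine by running ps there against the saved PID
ssh xyz@abc 'ps -p "$(cat /path/to/dir/process-<hostname>.pid)"'
#check the LSF job by its job ID (12345 is a placeholder)
bjobs -l 12345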
I just started using slurm and did
salloc -p UBUNTU bash
I started a script, then my system froze for another reason and I had to restart. How can I retrieve the ID of the allocated job so I can reattach and end the script?
You can see the list of your jobs with:
squeue -u $USER
Assuming you connect to your cluster with SSH, if you did not start a terminal multiplexer (such as screen or tmux) your job was most probably killed as soon as your SSH session ended.
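A short sketch of the recovery steps described above (the job ID 12345 is a placeholder):
#list your jobs and note the JOBID column
squeue -u $USER
#cancel the stale allocation once you know its job ID
scancel 12345
#for future runs, start the allocation inside tmux so it survives a disconnect
tmux new -s alloc
salloc -p UBUNTU bash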
I have a command that will SSH and run a script after SSH'ing. The script runs a binary file.
Once the script is done, I can type any key and my local terminal goes back to its normal state. However, since the process is still running in the machine I SSH'ed into, any time it logs to stdout I see it in my local terminal.
How can I ignore this output without monkey patching it on my local machine by passing it to /dev/null? I want to keep the output inside the machine I am SSH'ing to and I want to just leave the SSH altogether after the process starts. I can pass it to /dev/null in the machine, however.
This is an example of what I'm running:
cat ./sh/script.sh | ssh -i ~/.aws/example.pem ec2-user@11.111.11.111
The contents of script.sh looks something like this:
# Some stuff...
# Run binary file
./bin/binary &
Solved it with ./bin/binary &>/dev/null &
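Since the goal was to keep the output on the machine being SSH'ed into rather than discard it, a variant that redirects to a log file there should also work (the log path is an assumption):
# Run binary file, keeping its output on the remote machine
./bin/binary > /home/ec2-user/binary.log 2>&1 &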
Copy the script to the remote machine and then run it remotely. The following commands are executed on your local machine.
$ scp -i /path/to/sshkey /some/script.sh user@remote_machine:/path/to/some/script.sh
# Run the script in the background on the remote machine and pipe the output to a logfile. This will also exit from the SSH session right away.
$ ssh -i /path/to/sshkey \
  user@remote_machine "/path/to/some/script.sh &> /path/to/some/logfile &"
Note, logfile will be created on the remote machine.
# View the log file while the process is executing
$ ssh -i /path/to/sshkey user@remote_machine "tail -f /path/to/some/logfile"
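To check later whether the script is still running on the remote machine, one hedged option is pgrep over SSH (same key and paths as in the example above):
# Check whether the script is still running remotely
$ ssh -i /path/to/sshkey user@remote_machine "pgrep -f /path/to/some/script.sh"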
When a command is submitted with bsub, LSF starts a process with the res command.
res in turn starts the actual command as another process.
I want to know the PID of this actual command.
Let's say I have submitted the command below. With bhist -l jobid I can find the PID of res, but I haven't found a way to get the PID of virtuoso.
bsub -I -q interactive virtuoso &
If you run a script that calls virtuoso, you should be able to capture the PID of virtuoso from the script and then output it. Something like this should work:
#!/bin/bash
jobs &> /dev/null
virtuoso &
new_job_started="$(jobs -n)"
if [ -n "$new_job_started" ]; then
    VAR=$!
else
    VAR=
fi
echo "$VAR"
I don't know how useful this will be, since you probably won't be on the same machine your interactive shell is running on, so you won't be able to access the process with that PID.
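If the submission host and the execution host share a filesystem, a hedged variation is to record the execution host together with the PID so both can be looked up from elsewhere (the shared path is an assumption):
#inside the wrapper script, right after starting virtuoso in the background
virtuoso &
echo "$(hostname) $!" > /shared/path/virtuoso.pid
#later, from any machine that mounts the same path
cat /shared/path/virtuoso.pid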
There is a benchmarking process which should be run on a system. It takes maybe a day, so I would like to run it with nohup. I use this command:
nohup bash ./run_terasort.sh > terasort.out 2>&1 &
After that I can see it with its PID in the jobs -l output, but after closing PuTTY it stops (as far as I can see when I log in again).
This is a KVM virtualized machine.
You are using nohup correctly, as far as I can tell, but you have an issue detecting the process.
jobs -l only lists the jobs of the current session. Instead, try the following to find the process started in your earlier session:
ps -eafww | grep run_terasort.sh | grep -v grep
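An alternative sketch is to save the PID when launching, so any later session can check it directly (the terasort.pid file name is an assumption):
#launch as before, but also record the PID
nohup bash ./run_terasort.sh > terasort.out 2>&1 &
echo $! > terasort.pid
#in a later session
ps -p "$(cat terasort.pid)"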
I am using SSH to start a background process on a remote server. This is what I have at the moment:
ssh remote_user@server.com "nohup process &"
This works, in that the process does start. But the SSH session itself does not end until I hit Ctrl-C.
When I hit Ctrl-C, the remote process continues to run in the background.
I would like to place the ssh command in a script that I can run locally, so I would like the ssh session to exit automatically once the remote process has started.
Is there a way to make this happen?
The "-f" option to ssh tells ssh to run the remote command in the background and to return immediately. E.g.,
ssh -f user@host "echo foo; sleep 5; echo bar"
If you type the above, you will get your shell prompt back immediately; you will then see "foo" printed, and five seconds later "bar". In the meantime, you could have been using the shell.
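For a long-running remote command that should outlive the SSH connection entirely, -f can be combined with nohup and redirection. A hedged sketch, where long_task and the log path are placeholders:
ssh -f user@host "nohup long_task > /tmp/long_task.log 2>&1 < /dev/null &"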
When using nohup, make sure you also redirect stdin, stdout and stderr:
ssh user@server 'DISPLAY=:0 nohup xeyes < /dev/null > std.out 2> std.err &'
In this way you will be completely detached from the remote process. Be careful with using ssh -f user@host..., since that will only put the ssh process in the background on the calling side. You can verify this by running ps aux | grep ssh on the calling machine; it will show you that the ssh call is still active, just put in the background.
In my example above I use DISPLAY=:0 since xeyes is an X11 program and I want it started on the remote machine.
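To check on the detached process later, one hedged option is a fresh SSH session (xeyes and std.out come from the example above):
ssh user@server 'pgrep -x xeyes; tail -n 5 std.out'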
You could use screen to run your process, detach from the screen session with Ctrl-a :detach, and exit your current session without a problem. Then you can reconnect over SSH and attach to the screen again to continue with your task or check whether it has finished.
Or you can send the command to an already running screen. Your local script should look like this:
ssh remote_user@server.com
screen -dmS new_screen sh
screen -S new_screen -p 0 -X stuff $'nohup process \n'
exit
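A hedged variant does the same thing in a single ssh call by starting the command inside a detached screen session on the remote host (remote_user@server.com and process come from the question; the log path is an assumption):
ssh remote_user@server.com "screen -dmS new_screen bash -c 'nohup process > process.log 2>&1'"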
Well, this question is almost 10 years old, but I recently had to launch a very long script (taking several hours to complete) on a remote server, and I found a way using crontab.
If you can edit your user's crontab on the remote server, connect to the server with ssh, edit the crontab, and add an entry that will start your script the next minute. Let's say it's 15:03. Add this line:
4 15 * * * /path/to/your/script.sh
Save your crontab and wait a minute for the script to be launched. Then edit your crontab again to remove the entry.
You can then safely exit ssh, and even shut down your computer, while the script keeps running.
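A sketch of the same workflow without an interactive editor, reusing the script path and time from the example above:
#append the temporary entry to the existing crontab
( crontab -l 2>/dev/null; echo "4 15 * * * /path/to/your/script.sh" ) | crontab -
#after 15:04, confirm the script is running, then drop the entry again
crontab -l | grep -v '/path/to/your/script.sh' | crontab -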