Get ID of shell started using salloc - slurm

I just started using slurm and did
salloc -p UBUNTU bash
I started a script, then my system froze for another reason and I had to restart. How can I retrieve the ID of the allocated job so I can reattach and end the script?

You can see the list of your jobs with:
squeue -u $USER
Assuming you connect to your cluster with SSH: if you did not start a terminal multiplexer (such as screen or tmux), your job was most probably killed as soon as your SSH session ended.
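If squeue still lists the job, the first column (JOBID) is the allocation ID, and you can end the job with scancel. A minimal sketch, assuming the job ID turns out to be 12345:
# list only your jobs; note the JOBID column
squeue -u $USER
# end the allocation and everything still running inside it
scancel 12345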

Related

Bash script how to run a command remotely and then exit the remote terminal

I'm trying to execute the command:
ssh nvidia@ubuntu-ip-address "/opt/ads2/arm-linux64/bin/ads2 svcd&"
This works so far, except that it hangs in the remote terminal when "/opt/ads2/arm-linux64/bin/ads2 svcd&" is executed, unless I enter Ctrl+C. So I'm looking for a command that, after executing the command, exits from the remote terminal and continues executing the local bash script.
Thanks in advance
When you run a command in the background on a terminal, whether local or remote, most systems will warn you that you have running jobs if you attempt to log out. One further attempt to log out and your jobs get killed as you exit.
To avoid this you need to detach your running jobs from the terminal.
If the job is already running, you can use
disown -h <jobspec as reported by jobs>
If you want to run something in the background and then exit leaving it running, you can use nohup:
nohup command &
This is certainly OK on classic init systems; I am not sure it works exactly like this on systems that use systemd.
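Applied to the question's command, a sketch would look like this (the service path is taken from the question; the redirections are the key addition):
# detach the remote process and redirect stdin/stdout/stderr so the
# SSH session has nothing left to wait on and returns immediately
ssh nvidia@ubuntu-ip-address "nohup /opt/ads2/arm-linux64/bin/ads2 svcd < /dev/null > /dev/null 2>&1 &"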

Get PID of the jobs submitted by nohup in Linux

I'm using nohup to submit jobs in the background on the machines I got through BSUB and SSH.
My primary machine runs RHEL; from there I am picking up an AIX machine through BSUB (which submits a job to LSF) and also doing an SSH login to another server.
After getting these two machines, I execute a script (inner.sh) there through nohup.
I'm capturing the respective PIDs through echo $$ in the script which I am executing (inner.sh).
After submitting the nohup execution in the background, I exit both machines and land back on the primary RHEL machine.
Now, from this RHEL machine, I'm trying to get the status of the nohup executions with ps -p PID using the two previously captured PIDs, but no process gets listed.
Top level wrapper script wrapper.sh:
#!/bin/bash
# log in to a remote server
ssh -k xyz@abc < env_setup.sh
# pick up an AIX machine from LSF
bsub -q night -Is ksh -i env_setup.sh
ps -p $(cat process-<AIX_machine>.pid)
# got no output
ps -p $(cat process-<server_machine>.pid)
# got no output
The script passed to the machines picked up by BSUB/SSH to launch the nohup execution, env_setup.sh:
#!/bin/bash
nohup sh /path/to/dir/inner.sh > /path/to/dir/log-<hostname>.out &
exit
The actual script which I am trying to execute on the machines picked up by BSUB/SSH, inner.sh:
#!/bin/bash
echo $$ > /path/to/dir/process-<hostname>.pid
# hope this gives us the correct PID on the remote machine
# execute some other commands
Now I am getting two process-<hostname>.pid files, each updated with the PID from one of the two machines.
But ps -p in the wrapper script gives no output.
I am picking up the process IDs from the remote machines and running ps -p on my local RHEL machine.
Is that the reason I am not getting any status update on those two processes?
Can I do anything else to get the status?
ps reports only local processes, so a PID captured on a remote machine means nothing to ps on your RHEL box. Check the processes where they actually run: use bjobs to query the status of the LSF job, and run ps on the remote machines themselves.
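A sketch of both checks, reusing the host and pid-file names from the question (the LSF job ID is whatever bsub printed at submission time):
# ask the remote server itself whether the PID from its pid file is alive
ssh xyz@abc 'ps -p "$(cat /path/to/dir/process-<hostname>.pid)"'
# ask LSF about the job it dispatched to the AIX machine
bjobs <lsf_job_id>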

Running process nohup

There is a benchmarking process which should be run on a system. It takes maybe a day, so I would like to run it with nohup. I use this command:
nohup bash ./run_terasort.sh > terasort.out 2>&1 &
After that I can see its PID in the jobs -l output, but after closing PuTTY it stops (as I can see when I log in again).
This is a KVM virtualized machine.
You are using nohup correctly, as far as I can tell; the issue is how you detect the process.
jobs -l only lists the jobs of the current shell session. Instead, try the following to find the process started in your initial session:
ps -eafww|grep run_terasort.sh|grep -v grep
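An equivalent, slightly tidier check (pgrep is part of the standard procps tools on Linux):
# -f matches against the full command line, -a prints it alongside the PID
pgrep -af run_terasort.sh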

Use SSH to start a background process on a remote server, and exit session

I am using SSH to start a background process on a remote server. This is what I have at the moment:
ssh remote_user@server.com "nohup process &"
This works, in that the process does start. But the SSH session itself does not end until I hit Ctrl-C.
When I hit Ctrl-C, the remote process continues to run in the background.
I would like to place the ssh command in a script that I can run locally, so I would like the ssh session to exit automatically once the remote process has started.
Is there a way to make this happen?
The "-f" option to ssh tells ssh to run the remote command in the background and to return immediately. E.g.,
ssh -f user@host "echo foo; sleep 5; echo bar"
If you type the above, you will get your shell prompt back immediately; you will then see "foo" printed, and five seconds later "bar". In the meantime, you can keep using the shell.
When using nohup, make sure you also redirect stdin, stdout and stderr:
ssh user@server 'DISPLAY=:0 nohup xeyes < /dev/null > std.out 2> std.err &'
In this way you will be completely detached from the remote process. Be careful with using ssh -f user@host..., since that only puts the ssh process into the background on the calling side. You can verify this by running ps aux | grep ssh on the calling machine: it will show you that the ssh call is still active, just put into the background.
In my example above I use DISPLAY=:0 since xeyes is an X11 program and I want it started on the remote machine.
You could use screen to run your process in it, detach from the screen with Ctrl-a d (or Ctrl-a :detach), and exit your current session without problems. Then you can reconnect over SSH and attach to the screen again to continue with your task or check whether it has finished.
Or you can send the command to an already running screen. Your local script should look like this:
ssh remote_user@server.com
screen -dmS new_screen sh
screen -S new_screen -p 0 -X stuff $'nohup process \n'
exit
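Later, you can reattach to that session by name to check on the process (new_screen is the session name used in the script above):
ssh remote_user@server.com
screen -r new_screen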
Well, this question is almost 10 years old, but I recently had to launch a very long script (taking several hours to complete) on a remote server, and I found a way using crontab.
If you can edit your user's crontab on the remote server, connect to the server with ssh, edit the crontab, and add an entry that will start your script the next minute. Let's say it's 15:03. Add this line:
4 15 * * * /path/to/your/script.sh
Save your crontab and wait a minute for the script to be launched. Then edit your crontab again to remove the entry.
You can then safely exit ssh, even shut down your computer while the script is running.
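A sketch of the whole sequence (the script path is the example one from above):
ssh remote_user@server.com
crontab -e     # add: 4 15 * * * /path/to/your/script.sh
# ...wait until 15:04 and confirm the script is running, e.g. with pgrep...
crontab -e     # remove the line again
exit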

How to start a process that won't end when my ssh session ends?

Somehow this isn't yielded by a Google search.
I'm scripting a server in node.js. I start the server by executing its script with the node program:
node myserver.js
But the server staying up is dependent on my ssh session. How can I make this (and all such processes) persistent? Init.d?
Use the nohup command:
From http://en.wikipedia.org/wiki/Nohup
nohup is a POSIX command to ignore the HUP (hangup) signal, enabling the command to keep running after the user who issues the command has logged out. The HUP (hangup) signal is by convention the way a terminal warns dependent processes of logout.
Try this:
nohup node myserver.js &
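If you also want to keep the server's output after logging out, redirect it explicitly (the log file name here is just an example):
# keep stdout and stderr in a log file that survives the session
nohup node myserver.js > myserver.log 2>&1 &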
Have you tried GNU screen? Using it, a process can continue to run when you end your ssh session. nohup is a standard *nix command that will allow you to do the same, albeit in a more limited way.
Use screen. Type screen from the terminal and then launch your process. If you disconnect, you can reconnect to the session by typing screen -ls (to see active screens) and screen -r (to reattach).
The program needs to run in daemonized mode, e.g. via an init script as you suggest.
nohup is good to run the job under. If the job is already running, you can try disown -h (at least in bash).
