How to clear PBS job history

How can I clear all the PBS jobs that have finished, i.e. have status 'F'? I just want to see jobs that are queued or currently running. This will shorten the output of the qstat command.

With
$ qselect -x -s F
you can list the IDs of all jobs that have finished.
With
$ qdel -x <job_id>
you can delete a job's history.
Combine the two with xargs:
$ qselect -x -s F | xargs qdel -x
This deletes every job history.
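If you would rather review before deleting, a cautious variant looks like this (the -r flag is GNU xargs' --no-run-if-empty, so qdel is skipped when no finished jobs match):
$ qselect -x -s F | wc -l              # how many finished jobs would be purged
$ qselect -x -s F | xargs -r qdel -x   # -r: don't run qdel on an empty list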

How to run shell script commands in an sh file in parallel?

I'm trying to take backups of the tables in my database server.
I have around 200 tables. I have a shell script that contains commands to back up each table, like:
backup.sh
psql -u username ..... table1 ... file1;
psql -u username ..... table2 ... file2;
psql -u username ..... table3 ... file3;
I can run the script and create backups on my machine. But since there are 200 tables, it runs the commands sequentially and takes a lot of time.
I want to run the backup commands in parallel. I have seen articles suggesting adding && after each command, or using nohup or wait.
But I don't want to edit the script and add those to around 200 commands.
Is there a way to run this list of shell commands in parallel, something like Node.js does? Is it possible, or am I looking at it wrong?
Sample command in the script:
psql --host=somehost --port=5490 --username=user --dbname=db -c '\copy dbo.tablename TO "/home/username/Desktop/PostgresFiles/tablename.csv" with DELIMITER ","';
You can leverage xargs to run commands in parallel AND control the number of concurrent jobs. Running 200 backup jobs at once might overwhelm your database and result in less-than-optimal performance.
Assuming you have backup.sh with one backup command per line:
xargs -P5 -I{} bash -c "{}" < backup.sh
Here -P5 controls the number of concurrent jobs. Because each line is wrapped in bash -c "...", it cannot contain unescaped double quotes; the commands in backup.sh should be modified accordingly (use single quotes where possible, escape any remaining double quotes):
psql --host=somehost --port=5490 --username=user --dbname=db -c '\copy dbo.tablename TO \"/home/username/Desktop/PostgresFiles/tablename.csv\" with DELIMITER \",\"';
For a command that wraps the \copy argument itself in double quotes, you can also simply change "\copy ..." to '\copy ...'.
A simpler alternative is a helper script, backup-table.sh, that takes the table name as its parameter and derives the output file from it:
xargs -P5 -I{} backup-table.sh "{}" < tables.txt
That way all the complex quoting lives inside backup-table.sh.
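A minimal sketch of such a helper, assuming tables.txt holds one table name per line and reusing the connection options from the question (adjust paths and options to your server):
#!/bin/bash
# backup-table.sh - back up one table; the table name arrives as $1
table="$1"
psql --host=somehost --port=5490 --username=user --dbname=db \
    -c "\copy dbo.${table} TO '/home/username/Desktop/PostgresFiles/${table}.csv' with DELIMITER ','"
Remember to chmod +x backup-table.sh so xargs can execute it.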
Another option is to wrap the command in an exported bash function and drive it with GNU parallel:
doit() {
    table=$1
    psql --host=somehost --port=5490 --username=user --dbname=db \
        -c '\copy dbo.'$table' TO "/home/username/Desktop/PostgresFiles/'$table'.csv" with DELIMITER ","'
}
export -f doit
sql --listtables -n postgresql://user:pass@host:5490/db | parallel -j0 doit
Here sql is the GNU sql wrapper (by the same author as GNU parallel), and -j0 tells parallel to run as many jobs at once as the system allows.
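If GNU sql is not installed, the table list can come from psql itself; a sketch, assuming the tables live in the dbo schema:
psql --host=somehost --port=5490 --username=user --dbname=db -Atc \
    "SELECT tablename FROM pg_tables WHERE schemaname = 'dbo'" |
    parallel -j5 doit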
Is there any logic in the script other than the individual commands (e.g. ifs, or processing of output)?
If it's just a file with a list of commands, you could write a wrapper for the script (or a loop from the CLI), e.g.:
$ cat help.txt
echo 1
echo 2
echo 3
$ while read -r i; do bash -c "$i" & done < help.txt
[1] 18772
[2] 18773
[3] 18774
1
2
3
[1] Done bash -c "$i"
[2]- Done bash -c "$i"
[3]+ Done bash -c "$i"
$ while read -r i; do bash -c "$i" & done < help.txt
[1] 18820
[2] 18821
[3] 18822
2
3
1
[1] Done bash -c "$i"
[2]- Done bash -c "$i"
[3]+ Done bash -c "$i"
Each line of help.txt contains a command, and the loop takes each command and runs it in a subshell. (This is a simple example where I just background every job at once. You could get more sophisticated with something like xargs -P or parallel, but this is a starting point; see the sketch below.)
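For a bounded version of the same loop, either tool works; for instance (assuming GNU xargs / GNU parallel, with 3 concurrent commands as an arbitrary limit):
$ xargs -P3 -I{} bash -c "{}" < help.txt   # at most 3 commands at a time
$ parallel -j3 < help.txt                  # with no command argument, parallel runs each input line as a command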

Easy way to hold/release jobs by job array task id in slurm

I have a bunch of job arrays that are running right now (SLURM).
For example, 2552376_1, 2552376_10, 2552376_20, 2552376_80, 2552377_1, 2552377_10, 2552377_20, 2552377_80 and so on.
Currently, I am only interested in the ones that end with _1.
Is there any way to hold all the others without specifying job IDs (because I have several hundred of them)?
The following command works for holding all the jobs:
squeue -r -t PD -u $USER -o "scontrol hold %i" | tail -n +2 | sh
To release the ones with the needed ID I use
squeue -r -u $USER -o "scontrol release %i" | tail -n +2 | grep "_1$" | sh
which picks the correct jobs.
Mass update of jobs can be done by abusing the output formatting of squeue:
Hold all your pending jobs:
squeue -r -t PD -u $USER -o "scontrol hold %i" | sh
then release all of your jobs ending in _1:
squeue -r -t PD -u $USER -o "scontrol release %i" | grep "_1$" | sh
First run the commands without the | sh part to make sure they do what you intend.
Note the -r option to display one job array element per line.
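An equivalent that skips the shell-text generation altogether, using --noheader and xargs in the style of the next question's answers (-r is GNU xargs' --no-run-if-empty, so scontrol is not invoked on an empty list):
squeue -r -t PD -u $USER --noheader -o "%i" | xargs -r scontrol hold
squeue -r -u $USER --noheader -o "%i" | grep "_1$" | xargs -r scontrol release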

scontrol all jobs in user account

I am trying to hold all jobs submitted from my account. However, scontrol hold only takes a job (or array) ID, and I have many arrays. Is there an alternative command, like scancel -u user?
Edit1:
If iterating over all job IDs is the only way, this is my method:
squeue -u user | awk '{print $1;}' | while read jobid; do scontrol hold $jobid; done
While piping formatted text to sh is clever, I would probably do something like this:
squeue -u <user> --format "%i" --noheader | xargs scontrol hold
or
sacct --allocation --user=<user> --noheader --format=jobid | xargs scontrol hold
If you wanted to filter by state, you could do that as well:
squeue -u <user> --format "%i" --noheader --states=PENDING | xargs scontrol hold
or
sacct --allocation --user=<user> --noheader --format=jobid --state=PENDING | xargs scontrol hold
source: Slurm man pages
An often-used method is to (ab)use the formatting possibilities of squeue to build the scontrol line:
squeue -u user --format "scontrol hold job %i"
and then pipe that into a shell:
squeue -u user --format "scontrol hold job %i" | sh
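Note that squeue prints a header line by default (which is why tail -n +2 appeared in the previous question); adding --noheader keeps that line from being fed to the shell:
squeue -u user --noheader --format "scontrol hold job %i" | sh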

List number of jobs of each status

Is there a simple way to get SLURM to print out, for a given user, the number of jobs of each status (e.g., running, pending, completed, failed, etc.)?
One way to get that information is with:
squeue -u $USER -o%T -ST | uniq -c
The -u argument filters jobs for the specific user, -o%T outputs only the job state, and -ST sorts by state so that identical states end up adjacent; uniq -c then does the counting.
Example output:
$ squeue -u $USER -o%T -ST | uniq -c
147 PENDING
49 RUNNING
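squeue only covers jobs still in the queue (pending, running, etc.). For finished jobs (COMPLETED, FAILED, ...) the same counting trick works with sacct, assuming job accounting is enabled on the cluster:
$ sacct -u $USER --allocations --noheader --format=state | sort | uniq -c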

KSH: constrain the number of threads that can run at one time

I have a script that loops; each iteration invokes a process that runs in the background, like below:
xn_run_process.sh
...
for each in `ls ${INPUT_DIR}/MDX*.txt`
do
    java -Xms256m -Xmx1024m -cp ${CLASSPATH} com.wf.xn.etcc.Main -config=${CONFIG_FILE}
    ...
    for SCALE_PDF in `ls ${PROCESS_DIR}/*.pdf`
    do
        OUTPUT_AFP=${OUTPUT_DIR}/`basename ${SCALE_PDF}`
        OUTPUT_AFP=`print ${OUTPUT_AFP} | sed s/pdf/afp/g`
        ${PROJ_DIR}/myscript.sh -i ${SCALE_PDF} -o ${OUTPUT_AFP} &
        sleep 30
    done
done
When I wrote this, I expected only 5 instances of myscript.sh to execute concurrently; instead, the list launches 30, each doing quite heavy processing. How do I constrain the number of concurrent processes to 5?
While this is possible in pure shell scripting, the easiest approach would be using a parallelization tool like GNU parallel or GNU make. Makefile example:
SOURCES = ${SOME_LIST}
STAMPS = $(SOURCES:=.did-run-stamp)

all : $(STAMPS)

%.did-run-stamp : %
	/full/path/myscript.sh -f $<
and then calling make as make -j 5.
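For reference, the pure-shell variant alluded to above is also possible. A minimal sketch in the spirit of the question's inner loop (reusing its variables, with the OUTPUT_AFP derivation elided; this is a bash idiom, ksh93 users may prefer the JOBMAX variable shown at the end of this section):
for SCALE_PDF in `ls ${PROCESS_DIR}/*.pdf`
do
    while [ "$(jobs -r | wc -l)" -ge 5 ]
    do
        sleep 1    # poll until one of the 5 background jobs finishes
    done
    ${PROJ_DIR}/myscript.sh -i ${SCALE_PDF} -o ${OUTPUT_AFP} &
done
wait    # let the final batch drain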
Use GNU Parallel (adjust -j as you see fit; remove it to default to one job per CPU core):
for each in `ls ${INPUT_DIR}/MDX*.txt`
do
    java -Xms256m -Xmx1024m -cp ${CLASSPATH} com.wf.xn.etcc.Main -config=${CONFIG_FILE}
    ...
    for SCALE_PDF in `ls ${PROCESS_DIR}/*.pdf`
    do
        OUTPUT_AFP=${OUTPUT_DIR}/`basename ${SCALE_PDF}`
        OUTPUT_AFP=`print ${OUTPUT_AFP} | sed s/pdf/afp/g`
        sem --id myid -j 5 ${PROJ_DIR}/myscript.sh -i ${SCALE_PDF} -o ${OUTPUT_AFP}
    done
done
sem --wait --id myid
sem is part of GNU Parallel.
This will keep 5 jobs running until there are only 5 jobs left; then it will allow your java to run while the last 5 finish. sem --wait waits until those last 5 are finished, too.
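A standalone illustration of sem's throttling, with sleep standing in for myscript.sh:
for i in 1 2 3 4 5 6 7 8 9 10
do
    sem --id demo -j 5 sleep 2    # blocks whenever 5 sleeps are already running
done
sem --wait --id demo              # block until the last ones finish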
Alternatively:
for each ...
do
    java ...
    ...
    ls ${PROCESS_DIR}/*.pdf |
        parallel -j 5 ${PROJ_DIR}/myscript.sh -i {} -o ${OUTPUT_DIR}/{/.}.afp
done
This will run 5 jobs in parallel, and java will only run once all the jobs from the previous iteration have finished.
Alternatively you can use the queue trick described in GNU Parallel's man page: https://www.gnu.org/software/parallel/man.html#example__gnu_parallel_as_queue_system_batch_manager
echo >jobqueue; tail -f jobqueue | parallel -j5 &
for each ...
do
    ...
    ls ${PROCESS_DIR}/*.pdf |
        parallel echo ${PROJ_DIR}/myscript.sh -i {} -o ${OUTPUT_DIR}/{/.}.afp >> jobqueue
done
echo killall -TERM parallel >> jobqueue
wait
This will run java, then append jobs to the queue; after the jobs are queued, java runs again immediately. At all times up to 5 jobs from the queue will be running, until the queue is empty.
You can install GNU Parallel simply by:
wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
cp parallel sem
Watch the intro videos to learn more: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1 and walk through the tutorial (man parallel_tutorial). Your command line will love you for it.
If you have ksh93 check if JOBMAX is available:
JOBMAX
This variable defines the maximum number of background jobs that can run at a time. When this limit is reached, the shell will wait for a job to complete before starting a new job.
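A sketch of how that looks in practice, reusing the question's inner loop (hypothetical: it assumes a ksh93 build where JOBMAX is supported, and elides the OUTPUT_AFP derivation):
JOBMAX=5    # the shell now refuses to start a 6th background job
for SCALE_PDF in `ls ${PROCESS_DIR}/*.pdf`
do
    ${PROJ_DIR}/myscript.sh -i ${SCALE_PDF} -o ${OUTPUT_AFP} &
done
wait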
