With the command
$> squeue -u mnyber004
I can see all the jobs submitted from my cluster account (Slurm):
16884 ada CPUeq6 mnyber00 R 1-01:26:17 1 srvcnthpc105
16882 ada CPUeq4 mnyber00 R 1-01:26:20 1 srvcnthpc104
16878 ada CPUeq2 mnyber00 R 1-01:26:31 1 srvcnthpc104
20126 ada CPUeq1 mnyber00 R 22:32:28 1 srvcnthpc103
22004 curie WRI_0015 mnyber00 R 16:11 1 srvcnthpc603
22002 curie WRI_0014 mnyber00 R 16:13 1 srvcnthpc603
22000 curie WRI_0013 mnyber00 R 16:14 1 srvcnthpc603
How can I cancel all the jobs running on the partition ada?

In your case, scancel offers the appropriate filters, so you can simply run
scancel -u mnyber004 -p ada
Had that not been the case, a common idiom is to use the more powerful filtering of squeue together with the --format option to build the proper commands and then feed them to sh (-h suppresses the header line, which would otherwise be emitted as a bogus command):
squeue -u mnyber004 -p ada -h --format "scancel %i" | sh
You can play it safer by first saving the commands to a file, reviewing it, and then sourcing the file.
squeue -u mnyber004 -p ada -h --format "scancel %i" > /tmp/
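The save-then-review idea can be sketched without a cluster at hand. This is a minimal illustration, not part of Slurm: build_cancel_script is a hypothetical helper, the sample lines are copied from the question, and the temporary file name is whatever mktemp picks.

```shell
#!/bin/sh
# Hypothetical helper: turn squeue-style lines ("JOBID PARTITION ...") read
# from stdin into scancel commands for the partition given as $1.
build_cancel_script() {
    awk -v p="$1" '$2 == p && $1 ~ /^[0-9]+$/ { print "scancel " $1 }'
}

# Sample input copied from the question. On a real cluster this would be
# the output of: squeue -u mnyber004 -h
sample='16884 ada CPUeq6 mnyber00 R 1-01:26:17 1 srvcnthpc105
20126 ada CPUeq1 mnyber00 R 22:32:28 1 srvcnthpc103
22004 curie WRI_0015 mnyber00 R 16:11 1 srvcnthpc603'

script=$(mktemp)   # temporary file; the name is chosen by mktemp
printf '%s\n' "$sample" | build_cancel_script ada > "$script"
cat "$script"      # review the commands before running: sh "$script"
rm -f "$script"
```

Only the lines whose second column equals ada are turned into commands, so the curie job is left alone.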


Weka command line attributes arguments

On the command line, I'm able to get this rolling with no problem:
java weka.Run weka.classifiers.timeseries.WekaForecaster -W "weka.classifiers.functions.MultilayerPerceptron -L 0.01 -M 0.2 -N 500 -V 0 -S 0 -E 20 -H 20 " -t "C:\MyFile.arff" -F DirectionNumeric -L 1 -M 3 -prime 3 -horizon 6 -holdout 100 -G TradeDay -dayofweek -weekend -future
But once I add the skip list, I start getting errors saying that a date is missing and is not in the skip list, even though the date is in fact on it:
java weka.Run weka.classifiers.timeseries.WekaForecaster -W "weka.classifiers.functions.MultilayerPerceptron -L 0.01 -M 0.2 -N 500 -V 0 -S 0 -E 20 -H 20 " -t "C:\MyFile.arff" -F DirectionNumeric -L 1 -M 3 -prime 3 -horizon 6 -holdout 100 -G TradeDay -dayofweek -weekend -future -skip ""2014-06-07@yyyy-MM-dd, 2014-06-12"
Does anybody know how to get this working? Weka is low on documentation as far as I know.
Thanks in advance!
Never mind, I got it: the problem was that the 's' must be a capital letter:
instead of

How to use qsub without root login

I borrowed access to an SGE system, but my job does not run when I submit it with qsub. This is my job script:
#PBS -l nodes=1:ppn=4
#PBS -N pix2pix
#PBS -o pix2pix.out
#PBS -e pix2pix.err
#PBS -l walltime=72000:00:00
cd /public/home/chensu/others/Guicai/pix2pix-tensorflow
source activate tensorflow
python /public/home/chensu/others/Guicai/pix2pix-tensorflow/ --mode train --output_dir /public/home/chensu/others/Guicai/pix2pix-tensorflow/face2face-model --max_epochs 2000 --input_dir /public/home/chensu/others/Guicai/pix2pix-tensorflow/photos/combined/train --which_direction AtoB --batch_size 4
There are many free queues:
qstat -q
server: admin1
Queue            Memory CPU Time Walltime Node  Run Que Lm  State
---------------- ------ -------- -------- ----  --- --- --  -----
FAT_HIGH           --      --       --      --    0   0  0   E R
high               --      --       --      --   10   0 10   E R
MIC_HIGH           --      --       --      --    0   0  0   E R
GPU_HIGH           --      --       --      --    0   0  0   E R
low                --      --       --      --    4   0 20   E R
batch              --      --       --      --    0   0 20   E R
middle             --      --       --      --    0   0 10   E R
                                               ----- -----
                                                  14     0
When I submit my test.pbs with qsub test.pbs, the job shows up like this:
                                                        Req'd  Req'd     Elap
Job ID       Username Queue    Jobname SessID NDS TSK   Memory Time      S Time
20414.MGMT1  chensu   GPU_HIGH pix2pix --       1   4   --     72000:00: C --
Also, there is no log, so I don't know what happened.
Any suggestions would be appreciated.

Specify number of CPUs for a job on SLURM

I would like to run multiple jobs on a single node of my cluster. However, when I submit a job, it takes all available CPUs, so the remaining jobs are queued. As an example, I made a script that requests few resources and submits two jobs that are supposed to run at the same time.
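#! /bin/bash
variable=$(seq 0 1 1)
# File name is an assumption: squeue below only shows the truncated name run_thre
run_thread="./run_thread.sh"
for l in ${variable}
do
    # Write the batch script for this job
    cat << EOF > ${run_thread}
#! /bin/bash
#SBATCH -p normal
#SBATCH --nodes 1
#SBATCH --cpus-per-task 1
#SBATCH --ntasks 1
#SBATCH --threads-per-core 1
#SBATCH --mem=10G
sleep 120
EOF
    # Submit it
    sbatch ${run_thread}
done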
However, one job is running and the other is pending:
57 normal run_thre user PD 0:00 1 (Resources)
56 normal run_thre user R 0:02 1 node00
The cluster only has one node with 4 sockets with 12 cores and 2 threads each. The output of the command scontrol show jobid #job is the following:
UserId=user(1002) GroupId=user(1002) MCS_label=N/A
Priority=4294901755 Nice=0 Account=(null) QOS=(null)
JobState=RUNNING Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:00:51 TimeLimit=UNLIMITED TimeMin=N/A
SubmitTime=2018-03-24T15:34:46 EligibleTime=2018-03-24T15:34:46
StartTime=2018-03-24T15:34:46 EndTime=Unknown Deadline=N/A
PreemptTime=None SuspendTime=None SecsPreSuspend=0
Partition=normal AllocNode:Sid=node00:13047
ReqNodeList=(null) ExcNodeList=(null)
NumNodes=1 NumCPUs=48 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=10G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
Gres=(null) Reservation=(null)
OverSubscribe=NO Contiguous=0 Licenses=(null) Network=(null)
And the output of scontrol show partition is:
AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=YES QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=YES:4
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=48 TotalNodes=1 SelectTypeParameters=NONE
There is something I don't get with the SLURM system. How can I use only 1 CPU per job and run 48 jobs on the node at the same time?
Slurm is probably configured to allocate full nodes to jobs, which means it does not allow node sharing among jobs.
You can check with
scontrol show config | grep SelectType
Set SelectType to select/cons_res in slurm.conf to allow node sharing.
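As a rough sketch of how that check could be scripted: the helper name check_select_type is hypothetical, the sample text only imitates the layout of scontrol show config output, and select/linear is assumed here to be the whole-node plugin the answer has in mind.

```shell
#!/bin/sh
# Hypothetical helper: reads `scontrol show config` output on stdin and
# reports whether the select plugin hands out whole nodes.
check_select_type() {
    awk -F'[ =]+' '$1 == "SelectType" { print $2 }' | {
        read -r plugin
        case "$plugin" in
            select/linear)
                echo "whole-node allocation ($plugin)" ;;
            select/cons_res|select/cons_tres)
                echo "per-CPU allocation possible ($plugin)" ;;
            *)
                echo "unknown or missing SelectType" ;;
        esac
    }
}

# Sample input imitating the relevant lines. On a real cluster use:
#   scontrol show config | check_select_type
sample='SelectType              = select/linear
SelectTypeParameters    = NONE'
printf '%s\n' "$sample" | check_select_type
```

Changing SelectType requires editing slurm.conf and restarting the Slurm daemons, so this is something for the cluster administrator rather than a per-job setting.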

pgrep on Linux only matches the first 15 bytes of the process name

On my Linux machine, when I run ps -ef | grep Speed, I get the following:
myid 143410 49092 0 10:21 pts/12 00:00:00 ./OutSpeedyOrderConnection
myid 145492 49053 0 10:35 pts/11 00:00:00 ./SpeedyOrderConnection
That means the PIDs of these two processes are 143410 and 145492.
Then I run pgrep -l Speed and get the following:
143410 OutSpeedyOrderC
145492 SpeedyOrderConn
When I run pgrep OutSpeedyOrderC, I get a match, but pgrep OutSpeedyOrderCo returns nothing!
It looks like pgrep only matches the first 15 bytes of the process name.
Is there anything I can do to get the right answer when I run
pgrep OutSpeedyOrderConnection?
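By default pgrep matches against the process name kept by the kernel, which is limited to 15 characters, while the -f option makes it match against the full command line instead. A minimal sketch of the difference, assuming a throwaway script that merely borrows the long name from the question (the temporary directory and the sleep duration are arbitrary choices):

```shell
#!/bin/sh
# Create a throwaway executable script with the long name from the question.
dir=$(mktemp -d)
printf '#!/bin/sh\nsleep 10\n' > "$dir/OutSpeedyOrderConnection"
chmod +x "$dir/OutSpeedyOrderConnection"

"$dir/OutSpeedyOrderConnection" &
demo_pid=$!
sleep 1   # give the child a moment to exec the script

# Matching on the process name: the pattern is longer than the stored name,
# so nothing is found (newer pgrep versions also print a warning).
by_name=$(pgrep OutSpeedyOrderConnection 2>/dev/null || true)

# Matching on the full command line with -f finds the process.
by_cmdline=$(pgrep -f "$dir/OutSpeedyOrderConnection" || true)

echo "by name:         ${by_name:-<nothing>}"
echo "by command line: ${by_cmdline:-<nothing>}"

kill "$demo_pid" 2>/dev/null
rm -rf "$dir"
```

With -f the pattern is compared to everything in the command line, including arguments, so it can match more processes than intended; anchoring the pattern or combining it with -u helps narrow it down.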

perf: How to check processes running on a particular CPU

Is there any option in perf to look at the processes running on a particular CPU/core, and what percentage of that core each process takes?
Reference links would be helpful.
perf is intended for profiling, which is not a good fit for your case. You may try sampling /proc/sched_debug (if it is compiled into your kernel). For example, you can check which process is currently running on each CPU:
egrep '^R|cpu#' /proc/sched_debug
cpu#0, 917.276 MHz
R egrep 2614 37730.177313 ...
cpu#1, 917.276 MHz
R bash 2023 218715.010833 ...
Using its PID as a key, you can check how much CPU time, in milliseconds, it has consumed:
grep se.sum_exec_runtime /proc/2023/sched
se.sum_exec_runtime : 279346.058986
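Sampling that counter twice gives a rough CPU share for the process. The sketch below is only an illustration of the arithmetic: read_runtime_ms and cpu_share are hypothetical helper names, the one-second interval is arbitrary, and the result is the share of one CPU used by the process overall, not broken down per core.

```shell
#!/bin/sh
# Hypothetical helper: print se.sum_exec_runtime (milliseconds) for PID $1.
read_runtime_ms() {
    awk -F: '$1 ~ /^se\.sum_exec_runtime/ { gsub(/ /, "", $2); print $2 }' "/proc/$1/sched"
}

# Hypothetical helper: CPU share in percent from two readings (ms)
# taken $3 seconds apart.
cpu_share() {
    awk -v a="$1" -v b="$2" -v s="$3" \
        'BEGIN { printf "%.2f\n", (b - a) / (s * 1000) * 100 }'
}

pid=$$   # samples this shell itself; replace with the PID of interest
if [ -r "/proc/$pid/sched" ]; then
    first=$(read_runtime_ms "$pid")
    sleep 1
    second=$(read_runtime_ms "$pid")
    cpu_share "$first" "$second" 1
fi
```

For instance, cpu_share 1000 1500 1 corresponds to 500 ms of CPU time consumed during one second, that is half of one CPU.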
However, as @BrenoLeitão mentioned, SystemTap is quite useful here. Here is a script for your task.
global cputimes;
global cmdline;
global oncpu;
global NS_PER_SEC = 1000000000;

probe scheduler.cpu_on {
    /* remember when this process got onto a CPU */
    oncpu[pid()] = local_clock_ns();
}

probe scheduler.cpu_off {
    /* skip processes whose cpu_on event was not seen */
    if(oncpu[pid()] == 0)
        next;

    cmdline[pid()] = cmdline_str();
    cputimes[pid(), cpu()] <<< local_clock_ns() - oncpu[pid()];
    delete oncpu[pid()];
}

probe timer.s(1) {
    printf("%6s %3s %6s %s\n", "PID", "CPU", "PCT", "CMDLINE");

    foreach([pid+, cpu] in cputimes) {
        /* percentage scaled by 100 to keep two decimals in integer math */
        cpupct = @sum(cputimes[pid, cpu]) * 10000 / NS_PER_SEC;
        printf("%6d %3d %3d.%02d %s\n", pid, cpu,
            cpupct / 100, cpupct % 100, cmdline[pid]);
    }

    delete cputimes;
}
It traces the moments when a process starts running on a CPU and when it stops executing there (due to migration or sleeping) by attaching to the scheduler.cpu_on and scheduler.cpu_off probes. The second probe calculates the time difference between these events and saves it to the cputimes aggregation, along with the process command line.
timer.s(1) fires once per second: it walks over the aggregation and calculates the percentages. Here is sample output for CentOS 7 with bash running an infinite loop:
0 0 100.16
30 1 0.00
51 0 0.00
380 0 0.02 /usr/bin/python -Es /usr/sbin/tuned -l -P
2016 0 0.08 sshd: root@pts/0 "" "" "" ""
2023 1 100.11 -bash
2630 0 0.04 /usr/libexec/systemtap/stapio -R stap_3020c9e7ba76838179be68cd2390a10c_2630 -F3
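The percentage column relies on integer arithmetic: the accumulated nanoseconds are multiplied by 10000 and divided by the nanoseconds in one second, giving a percentage scaled by 100, which is then split into an integer part and two decimals. The same computation can be tried in plain shell; format_pct is a hypothetical name and the input values are made up to reproduce two of the sample lines.

```shell
#!/bin/sh
NS_PER_SEC=1000000000

# Hypothetical helper: $1 is the number of nanoseconds a process spent on a
# CPU during a one-second window; prints the share as PPP.DD.
format_pct() {
    cpupct=$(( $1 * 10000 / NS_PER_SEC ))   # percentage scaled by 100
    printf '%3d.%02d\n' $(( cpupct / 100 )) $(( cpupct % 100 ))
}

format_pct 1001600000   # slightly more than one second of CPU time
format_pct 800000       # 0.8 ms of CPU time
```

A value above 100 is possible because the timer probe does not fire at exactly one-second intervals, so a window can contain a little more than one second of run time.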
I understand that perf is not the proper way to do it, although you can limit perf to specific CPUs, for example with perf record -C <cpulist> or perf stat -C <cpulist>.
The closest you are going to get is the context-switch event, but this will not give you the application names at all.
I think you are going to need something more powerful, such as SystemTap.
