Report memory and CPU usage of a MATLAB process on a multicore Linux server

We need to know how much memory and CPU time a MATLAB process has used, including all of its spawned threads. If I understand correctly, the threads show up as separate entries with their own process IDs, but the CMD name stays the same.
So I thought about creating a daemon which appends the usage every n seconds:
ps -o %cpu,%mem,cmd -C MATLAB | grep -E "[0-9]+" >> matlab_log
and later counting and summing up the ratios multiplied by the daemon tick interval.
I wonder if there is an easier way, whether I am missing something, or whether a handier tool already exists for this job?
Cheers
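For what it's worth, the summing step described above can be sketched as a small awk helper (a rough sketch: it assumes the log lines look like `%cpu %mem cmd` and that one sample was taken per tick):

```shell
# estimate_usage LOGFILE TICK
# LOGFILE holds "%cpu %mem cmd" samples, one per TICK seconds.
# Each %cpu sample covers TICK wall-clock seconds, so total CPU time
# is roughly sum(%cpu)/100 * TICK; peak %MEM is the largest sample.
estimate_usage() {
    awk -v tick="$2" '
        { cpu += $1; if ($2 > peak) peak = $2 }
        END { printf "approx CPU seconds: %.1f, peak %%MEM: %.1f\n",
              cpu / 100 * tick, peak }
    ' "$1"
}
```

For example, `estimate_usage matlab_log 5` if the daemon ticks every 5 seconds.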

If you install the BSD Process Accounting utilities (package acct on Debian and Ubuntu) you can use the sa(8) utility to summarize executions or give you semi-detailed execution logs:
$ lastcomm
...
man F X sarnold pts/3 0.00 secs Fri May 4 16:21
man F X sarnold pts/3 0.00 secs Fri May 4 16:21
vim sarnold pts/3 0.05 secs Fri May 4 16:20
sa sarnold pts/3 0.00 secs Fri May 4 16:20
sa sarnold pts/3 0.00 secs Fri May 4 16:20
bzr sarnold pts/3 0.99 secs Fri May 4 16:19
apt-get S root pts/1 0.44 secs Fri May 4 16:18
dpkg root pts/1 0.00 secs Fri May 4 16:19
dpkg root pts/1 0.00 secs Fri May 4 16:19
dpkg root pts/1 0.00 secs Fri May 4 16:19
apt-get F root pts/1 0.00 secs Fri May 4 16:19
...
$ sa
633 15.22re 0.09cp 0avio 6576k
24 8.51re 0.03cp 0avio 6531k ***other*
2 0.31re 0.02cp 0avio 10347k apt-get
3 0.02re 0.02cp 0avio 9667k python2.7
18 0.04re 0.01cp 0avio 5444k dpkg
2 0.01re 0.01cp 0avio 13659k debsums
...
The format of the acct file is documented in acct(5), so you could write your own programs to parse the files if none of the standard tools lets you express the queries you want.
Probably the largest downside to the BSD process accounting utilities is that the kernel will only update the process accounting log when processes exit, because many of the summary numbers are only available once another process wait(2)s for it -- so currently running processes are completely overlooked by the utilities.
These utilities may be sufficient, though; they are how compute centers billed their clients, back when compute centers were popular...
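If acct is installed, totalling one command's recorded runtime from lastcomm output can be sketched like this (assuming the default lastcomm column layout shown above, where the runtime appears just before the word `secs`):

```shell
# sum_lastcomm — pipe `lastcomm <name>` into this to total the
# "X.XX secs" column over all recorded executions.
sum_lastcomm() {
    awk '{ for (i = 1; i < NF; i++) if ($(i+1) == "secs") total += $i }
         END { printf "%.2f secs total\n", total }'
}
```

For example, `lastcomm MATLAB | sum_lastcomm`.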

You can also use top:
top -b -n 1 | grep MATLAB
14226 user 39 19 2476m 1.4g 26m S 337.2 9.2 24:44.60 MATLAB
25878 user 39 19 2628m 1.6g 26m S 92.0 10.6 21:07.36 MATLAB
14363 user 39 19 2650m 1.4g 26m S 79.7 9.1 23:58.38 MATLAB
14088 user 39 19 2558m 1.4g 26m S 61.3 9.1 25:14.53 MATLAB
14648 user 39 19 2629m 1.6g 26m S 55.2 10.5 22:03.20 MATLAB
14506 user 39 19 2613m 1.5g 26m S 49.0 9.4 22:32.47 MATLAB
14788 user 39 19 2599m 1.6g 26m S 49.0 10.3 20:44.78 MATLAB
25650 user 39 19 2608m 1.6g 26m S 42.9 10.2 25:08.38 MATLAB
or, to get the field names too:
top -b -n 1 | head -n 7 | tail -n 1; top -b -n 1 | grep MATLAB
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14226 user 39 19 2476m 1.4g 26m S 337.2 9.2 24:44.60 MATLAB
25878 user 39 19 2628m 1.6g 26m S 92.0 10.6 21:07.36 MATLAB
14363 user 39 19 2650m 1.4g 26m S 79.7 9.1 23:58.38 MATLAB
14088 user 39 19 2558m 1.4g 26m S 61.3 9.1 25:14.53 MATLAB
14648 user 39 19 2629m 1.6g 26m S 55.2 10.5 22:03.20 MATLAB
14506 user 39 19 2613m 1.5g 26m S 49.0 9.4 22:32.47 MATLAB
14788 user 39 19 2599m 1.6g 26m S 49.0 10.3 20:44.78 MATLAB
25650 user 39 19 2608m 1.6g 26m S 42.9 10.2 25:08.38 MATLAB
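The per-process lines above still have to be added up by hand; a small wrapper can sum them (field positions assume the default batch-mode top layout shown here, where %CPU is column 9, %MEM is column 10, and the command name is last):

```shell
# sum_top NAME — pipe `top -b -n 1` into this to total %CPU and %MEM
# across every process whose command name matches NAME.
sum_top() {
    awk -v name="$1" '
        $NF == name { cpu += $9; mem += $10; n++ }
        END { printf "%d procs, %.1f %%CPU, %.1f %%MEM\n", n, cpu, mem }
    '
}
```

For example, `top -b -n 1 | sum_top MATLAB`.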


Memory occupied by unknown (VMware/CentOS)
Hello.
We have a server whose memory is almost fully used, but we cannot find what is eating it.
Memory usage jumped a few days ago from about 40% to nearly 100% and has stayed there since.
We'd like to kill whatever is eating the memory.
[Env]
# cat /etc/redhat-release
CentOS release 6.5 (Final)
# arch
x86_64
[status]
# free
total used free shared buffers cached
Mem: 16334148 15682368 651780 0 10168 398956
-/+ buffers/cache: 15273244 1060904
Swap: 8388600 129948 8258652
Result of top (some info is masked with ???):
# top -a
top - 10:19:14 up 49 days, 11:13, 1 user, load average: 1.05, 1.05, 1.10
Tasks: 145 total, 1 running, 143 sleeping, 0 stopped, 1 zombie
Cpu(s): 11.1%us, 18.4%sy, 0.0%ni, 69.5%id, 0.8%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 16334148k total, 15684824k used, 649324k free, 9988k buffers
Swap: 8388600k total, 129948k used, 8258652k free, 387824k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
17940 ??? 20 0 7461m 6.5g 6364 S 16.6 41.5 1113:27 java
4982 ??? 20 0 941m 531m 5756 S 2.7 3.3 611:22.48 java
3213 root 20 0 2057m 354m 2084 S 99.8 2.2 988:43.79 python
28270 ??? 20 0 835m 157m 5464 S 0.0 1.0 106:48.55 java
1648 root 20 0 197m 10m 1452 S 0.0 0.1 42:35.95 python
1200 root 20 0 246m 7452 808 S 0.0 0.0 2:37.42 rsyslogd
Processes that are using memory (some info is masked with ???):
# ps aux --sort rss
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1200 0.0 0.0 251968 7452 ? Sl Sep12 2:37 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
root 1648 0.0 0.0 202268 10604 ? Ss Sep12 42:36 /usr/lib64/???
??? 28270 0.1 0.9 855932 161092 ? Sl Sep14 106:49 /usr/java/???
root 3213 96.1 2.0 2107704 332932 ? Ssl Oct31 992:25 /usr/lib64/???
??? 4982 0.8 3.3 964096 544328 ? Sl Sep12 611:25 /usr/java/???
??? 17940 6.6 41.5 7649356 6781076 ? Sl Oct20 1113:49 /usr/java/???
Memory is almost 100% used, but with ps and top we can only find processes that use half of it.
We have checked the slab cache, but it was not the cause; Slab is only 90444 kB.
Nothing is found in syslog either.
Does anyone have any idea how to detect what is eating the memory?
Thank you in advance.
Run free -m and look at the difference. The available column shows the real free memory (older versions of free, like the CentOS 6 one above, show the same idea as the -/+ buffers/cache line instead).
And take a look at https://www.linuxatemyram.com/
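The -/+ buffers/cache figure is just the Mem: line with buffers and cache subtracted; running the arithmetic on the numbers posted above reproduces it:

```shell
# Derive the "-/+ buffers/cache" used figure from the Mem: line of
# the question's free output (all values in kB).
used=15682368; buffers=10168; cached=398956
echo "really used: $((used - buffers - cached)) kB"   # matches the 15273244 shown above
```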
We restarted the server, and that resolved the issue.

Unknown process of jenkins - "kxjdhendlvie" [closed]

I'm running Jenkins 2.38 on Ubuntu 14.04.5 LTS, on an AWS EC2 instance.
Here is the output of the top command:
top - 08:53:12 up 1 day, 39 min, 2 users, load average: 1.37, 1.37, 1.38
Tasks: 128 total, 2 running, 126 sleeping, 0 stopped, 0 zombie
%Cpu(s): 36.1 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 63.9 st
MiB Mem: 2000.484 total, 1916.172 used, 84.312 free, 420.863 buffers
MiB Swap: 4095.996 total, 5.953 used, 4090.043 free. 280.828 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3366 jenkins 20 0 231944 2976 560 S 94.9 0.1 1050:34 kxjdhendlvie
1119 mysql 20 0 1136676 463672 1996 S 1.0 22.6 29:43.49 mysqld
1578 www-data 20 0 490352 4644 1020 S 0.7 0.2 5:16.63 apache2
28038 root 20 0 23696 1664 1144 R 0.3 0.1 0:00.05 top
kxjdhendlvie has PID 3366. I have never seen this before, and Google turns up nothing about this Jenkins process either.
root@build:/proc/3366# ps aux | grep jenkins
jenkins 1233 0.0 0.0 18752 340 ? S May29 0:00 /usr/bin/daemon --name=jenkins --inherit --env=JENKINS_HOME=/var/lib/jenkins --output=/var/log/jenkins/jenkins.log --pidfile=/var/run/jenkins/jenkins.pid -- /usr/bin/java -Djava.awt.headless=true -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080
jenkins 1234 0.8 21.8 1655032 448576 ? Sl May29 12:56 /usr/bin/java -Djava.awt.headless=true -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080
jenkins 3366 88.1 0.1 231944 2976 ? Sl May29 1076:10 ./kxjdhendlvie -c hjyfsnkfs.conf
Directory listing of /proc/3366:
root@build:/proc/3366# ll -rth
total 0
dr-xr-xr-x 141 root root 0 May 29 08:13 ../
dr-xr-xr-x 9 jenkins jenkins 0 May 29 13:00 ./
-r--r--r-- 1 jenkins jenkins 0 May 29 13:00 status
-r--r--r-- 1 jenkins jenkins 0 May 29 13:00 stat
-r--r--r-- 1 jenkins jenkins 0 May 29 13:00 cmdline
-r--r--r-- 1 jenkins jenkins 0 May 29 13:27 statm
-r-------- 1 jenkins jenkins 0 May 29 16:27 environ
lrwxrwxrwx 1 jenkins jenkins 0 May 30 06:39 exe -> /var/tmp/kxjdhendlvie (deleted)
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 wchan
-rw-r--r-- 1 jenkins jenkins 0 May 30 08:36 uid_map
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 timers
dr-xr-xr-x 6 jenkins jenkins 0 May 30 08:36 task/
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 syscall
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 stack
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 smaps
-rw-r--r-- 1 jenkins jenkins 0 May 30 08:36 setgroups
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 sessionid
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 schedstat
-rw-r--r-- 1 jenkins jenkins 0 May 30 08:36 sched
lrwxrwxrwx 1 jenkins jenkins 0 May 30 08:36 root -> //
-rw-r--r-- 1 jenkins jenkins 0 May 30 08:36 projid_map
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 personality
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 pagemap
-rw-r--r-- 1 jenkins jenkins 0 May 30 08:36 oom_score_adj
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 oom_score
-rw-r--r-- 1 jenkins jenkins 0 May 30 08:36 oom_adj
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 numa_maps
dr-x--x--x 2 jenkins jenkins 0 May 30 08:36 ns/
dr-xr-xr-x 5 jenkins jenkins 0 May 30 08:36 net/
-r-------- 1 jenkins jenkins 0 May 30 08:36 mountstats
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 mounts
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 mountinfo
-rw------- 1 jenkins jenkins 0 May 30 08:36 mem
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 maps
dr-x------ 2 jenkins jenkins 0 May 30 08:36 map_files/
-rw-r--r-- 1 jenkins jenkins 0 May 30 08:36 loginuid
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 limits
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 latency
-r-------- 1 jenkins jenkins 0 May 30 08:36 io
-rw-r--r-- 1 jenkins jenkins 0 May 30 08:36 gid_map
dr-x------ 2 jenkins jenkins 0 May 30 08:36 fdinfo/
dr-x------ 2 jenkins jenkins 0 May 30 08:36 fd/
lrwxrwxrwx 1 jenkins jenkins 0 May 30 08:36 cwd -> /var/tmp/
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 cpuset
-rw-r--r-- 1 jenkins jenkins 0 May 30 08:36 coredump_filter
-rw-r--r-- 1 jenkins jenkins 0 May 30 08:36 comm
--w------- 1 jenkins jenkins 0 May 30 08:36 clear_refs
-r--r--r-- 1 jenkins jenkins 0 May 30 08:36 cgroup
-r-------- 1 jenkins jenkins 0 May 30 08:36 auxv
-rw-r--r-- 1 jenkins jenkins 0 May 30 08:36 autogroup
dr-xr-xr-x 2 jenkins jenkins 0 May 30 08:36 attr/
I see nothing related to kxjdhendlvie in /var/tmp/; the binary has apparently been deleted, but the process is still running.
Does anyone have any idea what this is? Please help me investigate.
./kxjdhendlvie -c hjyfsnkfs.conf
Here is hjyfsnkfs.conf:
{
"url" : "stratum+tcp://188.165.214.76:80",
"url" : "stratum+tcp://176.31.117.82:80",
"url" : "stratum+tcp://94.23.8.105:80",
"url" : "stratum+tcp://37.59.51.212:80",
"user" : "46v8xnTsBVx6BzPxb1JAGAj2fURbn6ne59sTa6kg8WEbX1yAoArxwUyMENKfFLJZ6A8b2EqDfSEaB5puwMvVyytfLmR2NoN",
"pass" : "x",
"algo" : "cryptonight",
"quiet" : true
}
Your Jenkins instance might have been compromised by this security exploit: https://groups.google.com/forum/m/#!topic/jenkinsci-advisories/sN9S0x78kMU. I suggest that you update your Jenkins installation. (The config above points at stratum mining pools with the cryptonight algorithm, so this looks like a cryptocurrency miner.)
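One quick check that would have flagged this process: scan /proc for processes whose executable has been deleted from disk, exactly what the exe -> /var/tmp/kxjdhendlvie (deleted) symlink above shows (a hedged sketch; dropper malware commonly deletes its binary after starting):

```shell
# Print every process whose /proc/<pid>/exe symlink points at a
# deleted file (the kernel appends " (deleted)" to the link target).
find_deleted_exe() {
    for exe in /proc/[0-9]*/exe; do
        target=$(readlink "$exe" 2>/dev/null) || continue
        case $target in
            *' (deleted)') echo "${exe%/exe}: $target" ;;
        esac
    done
}
```

Run it as root to see other users' processes; unreadable entries are skipped silently.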

Linux "top" command - want to aggregate resource usage to the process group or user name, especially for postgres

An important topic in software development is to assess the size of the product and to match the application footprint to the system where it is running. One may need to optimize the product, and/or one may need to add more memory, use a faster processor, etc. In the case of virtual machines, it is important to make sure the application will work effectively, perhaps by making the VM memory size larger, or by allowing the product to get more resources from the hypervisor when needed and available.
The linux top(1) command is great, with its ability to sort by different fields, add optional fields, highlight sort criteria on-screen, and switch sort field with < and >. On most systems though, there are very many processes running, making "at-a-glance" examination a little difficult. Consider:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ PPID SWAP nFLT COMMAND
2181 root 20 0 7565m 3.2g 7028 S 2.7 58.3 86:41.17 1 317m 10k java
1751 root 20 0 137m 2492 1056 S 0.0 0.0 0:02.57 1 5104 76 munin-node
11598 postgres 20 0 146m 23m 11m S 0.0 0.4 7:51.63 2143 3600 28 postmaster
1470 root 20 0 243m 1792 820 S 0.0 0.0 0:01.89 1 2396 23 rsyslogd
3107 postgres 20 0 146m 26m 11m S 0.0 0.5 7:40.61 2143 936 58 postmaster
3168 postgres 20 0 132m 14m 11m S 0.0 0.2 8:27.27 2143 904 53 postmaster
3057 postgres 20 0 138m 19m 11m S 0.0 0.3 6:55.63 2143 856 36 postmaster
3128 root 20 0 85376 900 896 S 0.0 0.0 0:00.11 1636 852 2 sshd
1728 root 20 0 80860 1080 952 S 0.0 0.0 0:00.61 1 776 0 master
3130 manager 20 0 85532 844 672 S 0.0 0.0 0:01.03 3128 712 36 sshd
436 root 16 -4 11052 264 260 S 0.0 0.0 0:00.01 1 688 0 udevd
2211 root 18 -2 11048 220 216 S 0.0 0.0 0:00.00 436 684 0 udevd
2212 root 18 -2 11048 220 216 S 0.0 0.0 0:00.00 436 684 0 udevd
1636 root 20 0 66176 524 436 S 0.0 0.0 0:00.12 1 620 25 sshd
1486 root 20 0 229m 2000 1648 S 0.0 0.0 0:00.79 1485 596 116 sssd_be
2306 postgres 20 0 131m 11m 9m S 0.0 0.2 0:01.21 2143 572 64 postmaster
3055 postgres 20 0 135m 16m 11m S 0.0 0.3 10:18.88 2143 560 36 postmaster
...etc.
This shows about 20 processes, but there are well over 100 running.
In this example I was sorting by SWAP field.
I would like to be able to aggregate related processes based on the "process group" of which they are a part, or based on the USER running the process, or based on the COMMAND being run. Essentially I want to:
Aggregate by PPID, or
Aggregate by USER, or
Aggregate by COMMAND, or
Turn off aggregation
This would allow me to see more quickly what is going on. The expectation is that all the postgres processes would show up together, as a single line, with the process group leader (2143, not captured in the snippet) displaying the aggregated metrics. Generally the aggregation would be a sum (VIRT, RES, SHR, %CPU, %MEM, TIME+, SWAP, nFLT), but sometimes not (as for PR and NI, which might be shown as just --).
For processes whose PPID is 1, it would be nice to have an option of toggling between aggregating them all together, or of leaving them listed individually.
Aggregation by the name of the process (java vs. munin-node vs. postmaster vs. chrome) would also be a nice option. The COMMAND arguments would not be used when aggregating by command name.
This would be very valuable when tuning an application. How can I do this, aggregating top data for at-a-glance viewing in larger scale systems? Has anyone written an app, perhaps that uses top in batch mode, to create a summary view like I'm discussing?
FYI, I'm specifically interested in something for CentOS, but this would be helpful on any OS variant.
Thanks!
...Alan
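No stock top mode does exactly this, but as a rough stand-in, the aggregation can be sketched over ps output with awk (a sketch, not a full top replacement: it sums RSS and %CPU keyed on either the user or the command-name column):

```shell
# agg_ps KEYCOL — pipe `ps -eo user,comm,rss,%cpu --no-headers` into
# this; groups rows by column KEYCOL (1 = user, 2 = command name) and
# sums RSS (kB) and %CPU per group.
agg_ps() {
    awk -v key="$1" '
        { rss[$key] += $3; cpu[$key] += $4; n[$key]++ }
        END { for (k in n)
                  printf "%-12s %4d procs %10d kB RSS %6.1f %%CPU\n",
                         k, n[k], rss[k], cpu[k] }
    ' | sort
}
```

For example, `ps -eo user,comm,rss,%cpu --no-headers | agg_ps 1` rolls everything up per user, so all the postgres backends collapse into one line.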

Shell script ran on multiuser environment

I have a linux server and there contains an important script xyz.sh. At times there will be 10-50 users logged into that machine. Is it possible to find who is running the script? Also, is it possible to get a log who all have ran the script xyz.sh; means is it possible to extract a history of script run?
For a simple on-the-fly check of the script's owners while it's running, you can use:
$ ps -e -o euid,pid,euser,state,command | grep "xyz.sh" | grep -v grep
0 31096 root S /bin/bash ./xyz.sh
1000 31030 ale S /bin/bash ./xyz.sh
It should be possible to log the ps output with a script like this:
#!/bin/bash
# Poll every INTERVAL seconds and append who is running TARGET.
# Note: don't name the variable SECONDS; bash reserves it as a
# self-incrementing counter, so `sleep $SECONDS` would sleep for a
# longer and longer time on each pass.
INTERVAL=5
TARGET=xyz.sh
OUT=/var/tmp/xyz_history.log
while true
do
    sleep "$INTERVAL"
    date '+TIME:%H:%M:%S'
    ps -e -o pid,user,command | grep "$TARGET" | grep -v grep
done >> "$OUT"
The output:
$ tail -f /var/tmp/xyz_history.log
TIME:14:13:37
496 postgres /bin/bash ./xyz.sh
625 ale /bin/bash ./xyz.sh
32137 root /bin/bash ./xyz.sh
TIME:14:13:38
496 postgres /bin/bash ./xyz.sh
625 ale /bin/bash ./xyz.sh
32137 root /bin/bash ./xyz.sh
TIME:14:13:39
496 postgres /bin/bash ./xyz.sh
625 ale /bin/bash ./xyz.sh
TIME:14:13:40
496 postgres /bin/bash ./xyz.sh
625 ale /bin/bash ./xyz.sh
...
This is not a clean solution, of course. If you can install packages on the system and run commands as superuser, a better solution is to use lastcomm:
# lastcomm xyz.sh
xyz.sh X ale pts/1 0.00 secs Fri Sep 11 14:12
xyz.sh X root pts/3 0.00 secs Fri Sep 11 14:00
xyz.sh X ale pts/4 0.00 secs Fri Sep 11 14:08
xyz.sh X ale pts/4 0.00 secs Fri Sep 11 14:00
xyz.sh X root pts/4 0.00 secs Fri Sep 11 13:54
xyz.sh X ale pts/1 0.00 secs Fri Sep 11 13:51
xyz.sh X root pts/3 0.00 secs Fri Sep 11 13:42
xyz.sh X ale pts/1 0.00 secs Fri Sep 11 13:36
xyz.sh X ale pts/1 0.00 secs Fri Sep 11 13:36
xyz.sh X ale pts/1 0.00 secs Fri Sep 11 13:36
xyz.sh X ale pts/1 0.00 secs Fri Sep 11 13:36
xyz.sh X postgres pts/1 0.00 secs Fri Sep 11 13:36
xyz.sh X ale pts/1 0.00 secs Fri Sep 11 13:36
xyz.sh X root pts/1 0.00 secs Fri Sep 11 13:36
xyz.sh X ale pts/1 0.00 secs Fri Sep 11 13:36
The lastcomm command can be installed from the psacct package (CentOS/Red Hat) or the acct package (Debian/Ubuntu/openSUSE).

apache2 multiple instances under www-data

I am running apache2 on my Raspberry Pi, mainly to interface with an MPD PHP client for streaming audio. After a month or so, I see the following:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1496 www-data 20 0 55900 17m 2112 S 0.0 9.3 0:30.32 apache2
7198 www-data 20 0 54868 15m 2188 S 0.0 8.4 0:10.57 apache2
7182 www-data 20 0 54868 15m 2168 S 0.0 8.3 0:11.67 apache2
1497 www-data 20 0 53844 15m 2132 S 0.0 8.2 0:07.58 apache2
2609 mysql 20 0 314m 15m 280 S 0.7 8.1 71:58.52 mysqld
7185 www-data 20 0 54868 14m 2180 S 0.0 8.1 0:08.71 apache2
7183 www-data 20 0 54868 14m 2120 S 0.0 8.1 0:14.36 apache2
1499 www-data 20 0 53844 14m 2144 S 0.0 8.0 0:07.73 apache2
1932 mpd 20 0 81204 8152 584 S 0.0 4.3 145:46.25 mpd
7211 www-data 20 0 45652 8004 2204 S 0.0 4.2 0:01.65 apache2
3318 www-data 20 0 45652 7944 2140 S 0.0 4.2 0:03.43 apache2
7210 www-data 20 0 45652 7784 2176 S 0.0 4.1 0:01.28 apache2
1965 root 20 0 44532 5268 216 S 0.0 2.8 1:53.06 apache2
7168 www-data 20 0 45652 7956 2140 S 0.0 4.2 0:02.42 apache2
Along with mpd, mysql, and the root apache2 process, there are 11 apache2 processes running as www-data. On reboot, I see 5 apache2 processes under www-data.
Why are more processes spawned and then never closed down? The count keeps growing until there are 20+ processes, which slows down a device this small with such limited resources.
Can I control this in conf.d? (I have tried, but the feedback loop takes a few days to a week, so it's hard to tell.)
Apache, when running in prefork mode, spawns a pool of worker processes in order to keep response times low. Each worker handles one request at a time, so if there are 11 workers running, Apache can serve 11 requests in parallel without spawning a new process (which would take a significant amount of time).
Apache spawns those workers and keeps them alive based on load, but you can set the maximum and minimum number of workers in apache2.conf.
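For example, the prefork worker pool can be capped in the MPM section of apache2.conf (illustrative values, not tuned for any particular workload; on Apache 2.2, MaxRequestWorkers is spelled MaxClients and MaxConnectionsPerChild is MaxRequestsPerChild):

```apache
# Keep the worker pool small on a low-memory host like a Raspberry Pi.
<IfModule mpm_prefork_module>
    StartServers             2
    MinSpareServers          2
    MaxSpareServers          4
    MaxRequestWorkers        8
    MaxConnectionsPerChild   500
</IfModule>
```

MaxSpareServers is the knob that decides how many idle workers survive after a traffic burst, which is why the process count stays high once it has grown.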
