Bash: How to record highest memory/cpu consumption during execution of a bash script?

I have a function in a bash script that executes a long process called runBatch. runBatch takes a file as an argument and loads its contents into a db (it is just a wrapper around the database command that loads the file's content).
My function has a loop that looks something like the below, where I am currently recording start time and elapsed time for the process to variables.
for batchFile in "$batchFilesDir"/*
do
    echo "Batch file is $batchFile"
    START_TIME=$(($(date +%s%N)/1000000))    # wall-clock start, in milliseconds
    runBatch "$batchFile"
    ELAPSED_TIME=$(($(date +%s%N)/1000000 - START_TIME))
    IN_SECONDS=$(awk "BEGIN {printf \"%.2f\", ${ELAPSED_TIME}/1000}")
done
Then I am writing some information on each batch (such as time, etc.) to a table in a html page I am generating.
How would I go about recording the highest memory/cpu usage while the runBatch is running, along with the time, etc?
Any help appreciated.
Edit: I managed to get this done. I added a wrapper script that runs this script in the background and passes its PID (captured with $!) to another script, which monitors the process's CPU and memory usage with top every second. I compile everything into an html page at the end, once the PID is no longer alive. Cheers for the pointers.

You should be able to get the PID of the process using $!,
runBatch $batchFile &
myPID=$!
and then you can run top -b -p $myPID to print a ticking summary of CPU usage.
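If you want the recorded peak rather than a ticking display, here is a minimal sketch of the same polling idea (assuming the usual procps top layout, where %CPU and %MEM are columns 9 and 10; maxCPU/maxMEM are illustrative names):
maxCPU=0 maxMEM=0
while kill -0 "$myPID" 2>/dev/null; do    # loop while the process is still alive
    read -r cpu mem < <(top -b -n 1 -p "$myPID" | awk -v p="$myPID" '$1 == p {print $9, $10}')
    # bash arithmetic is integer-only, so compare the floats with awk
    awk -v a="$cpu" -v b="$maxCPU" 'BEGIN{exit !(a > b)}' && maxCPU=$cpu
    awk -v a="$mem" -v b="$maxMEM" 'BEGIN{exit !(a > b)}' && maxMEM=$mem
    sleep 1
done
echo "peak CPU: ${maxCPU}%  peak MEM: ${maxMEM}%"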

Memory:
cat /proc/meminfo
then grep for whatever you want.
CPU is more complicated - see /proc/stat explained.
Average load:
cat /proc/loadavg
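For instance (a sketch; MemAvailable needs a 3.14+ kernel, use MemFree alone on older ones):
grep -E 'MemFree|MemAvailable' /proc/meminfo    # memory figures, in kB
read -r load1 _ < /proc/loadavg                 # first field is the 1-minute average
echo "1-min load: $load1"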
For timing "runBatch" use
time runBatch
like
time sleep 10

Once you've got the pid of your process (e.g. as in the answers above) you can use the proc(5) file system (with watch(1) and cat(1) or grep(1)), e.g.
watch cat /proc/$myPID/stat
(or use /proc/$myPID/status or /proc/$myPID/statm, or /proc/$myPID/maps for the address space, etc...)
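One detail worth adding for the "highest memory" part of the original question: the kernel already tracks peaks in /proc/$myPID/status - VmPeak is the peak virtual size and VmHWM the peak resident set (the "high-water mark") - so one read shortly before the process exits captures the maximum:
grep -E 'VmPeak|VmHWM' "/proc/$myPID/status"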
BTW, to run batch jobs you should consider batch (and you might look into crontab(5) to run things periodically)

Related

Linux - read or collect file content faster (e.g. cpu temp every sec.)

I'm working on a system on which ubuntu is running. I'm reading basic data like CPU frequency and temperature out of the thermal zones provided in /sys/class/thermal.
Unfortunately, I've got around 100 thermal_zones from which I need to read the data. I do it with:
for SENSOR_NODE in /sys/class/thermal/thermal_zone*; do printf "%s: %s\n" $(cat ${SENSOR_NODE}/type) $(cat ${SENSOR_NODE}/temp); done
To collect all the data takes ~2.5-3 sec., which is way too long.
Since I want to collect the data every second, my question is whether there is a way to "read" or "collect" the data faster?
Thank you in advance
There's only so much you can do while writing your code in shell, but let's start with the basics.
Command substitutions, $(...), are expensive: They require creating a pipe, fork()ing a new subprocess, connecting the pipe to that subprocess's stdout, reading from the pipe and waiting for the commands running in that subshell to exit.
External commands, like cat, are expensive: They require linking and loading a separate executable; and unless you run them with exec (in which case they take over the shell's process ID), a new process also has to be fork()ed off.
All POSIX-compliant shells give you a read command:
for sensor_node in /sys/class/thermal/thermal_zone*; do
    read -r sensor_type <"$sensor_node/type" || continue
    read -r sensor_temp <"$sensor_node/temp" || continue
    printf '%s: %s\n' "$sensor_type" "$sensor_temp"
done
...which lets you avoid the command substitution overhead and the overhead of cat. However, read reads content only one byte at a time; so while you're not paying that overhead, it's still relatively slow.
If you switch from /bin/sh to bash, you get a faster alternative:
for sensor_node in /sys/class/thermal/thermal_zone*; do
    printf '%s: %s\n' "$(<"$sensor_node/type")" "$(<"$sensor_node/temp")"
done
...as $(<file) doesn't need to do the one-byte-at-a-time reads that read does. It's faster only by shell standards, though; that doesn't mean it's actually fast. There's a reason modern production monitoring systems are typically written in Go or on a JavaScript runtime like Node.
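A further trick in the same spirit (a sketch, not from the answer above: it assumes every thermal_zone has both a type and a temp file, so the two globs pair up line by line) is to batch all ~100 reads into two cat invocations and join them with paste:
paste -d: <(cat /sys/class/thermal/thermal_zone*/type) \
          <(cat /sys/class/thermal/thermal_zone*/temp)
That is three external processes per pass instead of a couple of hundred.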

How do I extract current process CPU usage by path/command and print it to the console

I'd like to get current process CPU/memory usage% by process name/path and print it to the console.
the command should output one number and not provide an ongoing data flow like 'ps'.
ps -p PID doesn't work as:
I don't have the process number (I do have process path)
It doesn't print the current measurement once to the console
So for example it should look something like:
$command -getCPU | grep processPath
You actually do know the PID if the path you have is under /proc, since those per-process directories are named /proc/<pid>.
You can calculate the CPU usage with this method. It involves several steps though.
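"This method" boils down to sampling /proc twice and dividing the deltas. A minimal bash sketch of it (field positions per proc(5); $1 is assumed to be the process path, which pgrep -f resolves to a PID):
pid=$(pgrep -of "$1") || exit 1    # -o: oldest process whose command line matches

proc_ticks() {
    local stat
    stat=$(</proc/"$pid"/stat)
    stat=${stat##*) }              # strip "pid (comm) "; comm may contain spaces
    set -- $stat                   # now $12 is utime, $13 is stime (clock ticks)
    echo $(( ${12} + ${13} ))
}
total_ticks() {
    # sum every field of the aggregate "cpu" line in /proc/stat
    awk '/^cpu /{t=0; for (i=2; i<=NF; i++) t+=$i; print t}' /proc/stat
}

p1=$(proc_ticks); t1=$(total_ticks)
sleep 1
p2=$(proc_ticks); t2=$(total_ticks)

# scale by core count so 100% means one fully busy core, as in top's default view
awk -v dp="$((p2-p1))" -v dt="$((t2-t1))" -v n="$(nproc)" \
    'BEGIN { printf "%.1f\n", 100 * n * dp / dt }'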

Get the load, cpu usage and time of executing a bash script

I have a bash script that I plan to run every 5 or 15 mins using crontab based on the load it puts on server.
I can find the running time of the script, but I am not sure how to find the load, memory usage, and CPU usage.
Can someone help me?
Also any suggestions of rough benchmark that will help me decide if the script puts too much load and should be run every 15 mins and not 5 mins.
Thanks in Advance!
You can use "top -b", top gives the CPU usage, memory usage etc,
Insert these lines in your script, this will process in background and will terminate the process as soon as your testing overs.
ssh server_name "nohup top -b -d 0.5 >> file_name &"
\top process will run in background because of &, -d 0.5 will give you the cpu status at every 0.5 secs, redirect the output in file_name for latter analysis.
for killing the process after your test, insert following in your script,
ssh server_name "kill \`ps -elf | grep 'top -b' | grep -v grep | sed 's/ */ /g' |cut -d ' ' -f4\`"
Your main testing script should be between top command and command for killing top.
I presumed you are running the script from client side, if not ignore "ssh server_name".
If you are running it from client side, because of "ssh", you will be asked for the password, for avoiding this follow these 3 simple steps
This will definitely solve the issue.
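As an aside, on systems with procps the ps/grep/sed/cut pipeline above can be replaced with pkill, which matches against the full command line with -f; the [t] in the pattern stops the invoking shell (whose command line also contains the pattern text) from matching itself:
ssh server_name "pkill -f '[t]op -b'"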
You can check the following utilities:
pidstat for CPU load, man page
pmap for memory load, man page
Note that you might also need to take measurements for the child processes of your executable, in order to collect summary information.
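For example (a sketch using the sysstat and procps tools; $PID stands for whatever process you are measuring):
pidstat -u -r -p "$PID" 1 60    # CPU (-u) and memory (-r), once a second, 60 samples
pmap -x "$PID" | tail -n 1      # extended memory map; the last line is the total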
For memory, use free -m. Your actual available memory is the second number on the -/+ buffers/cache line (in megabytes with -m); newer versions of free report it directly in the "available" column (source).
For CPU, it's a bit more complicated. Start by looking at cat /proc/stat | grep 'cpu ' (note the space). You'll see something like this:
cpu 2255 34 2290 22625563 6290 127 456
The columns are from left to right, "user, nice, system, idle". CPU usage is usually calculated as (user+nice+system) / (user+nice+system+idle). However, these numbers show the number of "time units" that the CPU has spent doing that thing since boot, and thus are always increasing. If you were to do the aforementioned calculation, you'd get the CPU usage average since boot. To get a point-in-time usage, you have to take 2 samples, find their difference, and calculate the usage from that. To be clear, that will be the average CPU usage between your samples. (source)
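A minimal sketch of that two-sample calculation (field order as shown above; the trailing _ soaks up the remaining columns):
read -r _ u1 n1 s1 i1 _ < /proc/stat     # first sample of the aggregate cpu line
sleep 1
read -r _ u2 n2 s2 i2 _ < /proc/stat     # second sample
busy=$(( (u2 + n2 + s2) - (u1 + n1 + s1) ))
total=$(( busy + (i2 - i1) ))
awk -v b="$busy" -v t="$total" 'BEGIN { printf "CPU: %.1f%%\n", 100 * b / t }'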

Time taken by `less` command to show output

I have a script that produces a lot of output. The script pauses for a few seconds at point T.
Now I am using the less command to analyze the output of the script.
So I execute ./script | less. I leave it running long enough that the script will have finished executing.
Now I go through the output of the less command by pressing the Page Down key. Surprisingly, while scrolling past point T of the output, I notice the pause of a few seconds again.
The script does not expect any input and would have definitely completed by the time I start analyzing the output of less.
Can someone explain how the pause of a few seconds is noticeable in the output of less when the script has already finished executing?
Your script is communicating with less via a pipe. A pipe is an in-memory stream of bytes that connects two endpoints: your script and the less program, the former writing output to it, the latter reading from it.
As pipes are in-memory, it would not be pleasant if they grew arbitrarily large. So, by default, there's a limit on how much data can sit inside the pipe (written, but not yet read) at any given moment: 64k on Linux. If the pipe is full and your script tries to write to it, the write blocks. So your script isn't actually working; it has stopped inside a write() call.
How do you overcome this? Adjusting the defaults is a bad option; what is used instead is a buffer in the reader, so that it reads into the buffer, draining the pipe and letting the writing program work, while showing you (or handling) only a part of the output. less has such a buffer and, by default, expands it automatically. However, it doesn't fill it in the background; it only fills it as you read the input.
So what would solve your problem is reading the file until the end (as you would normally do by pressing G) and then going back to the beginning (as you would by pressing g). The thing is that you may specify these commands on the command line, like this:
./script | less +Gg
You should note, however, that you will have to wait until the whole script's output loads into memory, so you won't be able to view it at once. less is insufficiently sophisticated for that. But if that's what you really need (browsing the beginning of the output while the ./script is still computing its end), you might want to use a temporary file:
./script >x & less x ; rm x
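You can watch the 64k limit bite with a toy writer whose reader never reads; on Linux the stderr counter stalls around 65536 bytes (a sketch, the exact capacity varies with kernel configuration):
(
    i=0
    while printf '%4096s' ''; do    # push 4 KiB of spaces into the pipe
        i=$((i + 4096))
        echo "wrote $i bytes" >&2
    done
) | sleep 60                        # sleep never reads, so the pipe fills up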
The pipe is full at the OS level, so the script blocks until less consumes some of it.
Flow control. Your script is effectively being paused while less is paging.
If you want to make sure that your command completes before you use less interactively, invoke it as less +G and it will read to the end of the input; you can then return to the start by typing 1G into less.
For some background information there's also a nice article by Alexander Sandler called "How less processes its input": http://www.alexonlinux.com/how-less-processes-its-input
Can I externally enforce line buffering on the script?
Is there an off-the-shelf pseudo-tty utility I could use?
You may try using the script command to turn on line-buffered output mode.
script -q /dev/null ./script | less # FreeBSD, Mac OS X
script -c "./script" /dev/null | less # Linux
For more alternatives in this respect please see: Turn off buffering in pipe.
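If GNU coreutils is available, stdbuf is another option that avoids allocating a pseudo-tty, though it only helps programs that use stdio's default buffering:
stdbuf -oL ./script | less    # -oL forces line-buffered stdout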

Get CPU usage in shell script?

I'm running some JMeter tests against a Java process to determine how responsive a web application is under load (500+ users). JMeter will give the response time for each web request, and I've written a script to ping the Tomcat Manager every X seconds which will get me the current size of the JVM heap.
I'd like to collect stats on the server of the % of CPU being used by Tomcat. I tried to do it in a shell script using ps like this:
PS_RESULTS=`ps -o pcpu,pmem,nlwp -p $PID`
...running the command every X seconds and appending the results to a text file. (for anyone wondering, pmem = % mem usage and nlwp is number of threads)
However, I've found that this gives a different definition of "% of CPU Utilization" than I'd like - according to the man page for ps, pcpu is defined as:
cpu utilization of the process in "##.#" format. It is the CPU time used divided by the time the process has been running (cputime/realtime ratio), expressed as a percentage.
In other words, pcpu gives me the % CPU utilization for the process for the lifetime of the process.
Since I want to take a sample every X seconds, I'd like to be collecting the CPU utilization of the process at the current time only - similar to what top would give me (CPU utilization of the process since the last update).
How can I collect this from within a shell script?
Use top -b (and other switches if you want different outputs). It will just dump to stdout instead of jumping into a curses window.
The most useful tool I've found for monitoring a server while performing a test such as JMeter on it is dstat. It not only gives you a range of stats from the server, it outputs to csv for easy import into a spreadsheet and lets you extend the tool with modules written in Python.
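For instance, a sketch of such a run (jmeter-run.csv is an illustrative name; --output is the flag that produces the CSV):
dstat --cpu --mem --output jmeter-run.csv 1    # sample every second until interrupted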
User load: top -b -n 2 |grep Cpu |tail -n 1 |awk '{print $2}' |sed 's/.[^.]*$//'
System load: top -b -n 2 |grep Cpu |tail -n 1 |awk '{print $3}' |sed 's/.[^.]*$//'
Idle load: top -b -n 2 |grep Cpu |tail -n 1 |awk '{print $5}' |sed 's/.[^.]*$//'
Each result is a whole number; the sed strips the decimal part, and the second top iteration is used because the first one reports averages since boot. (Column positions vary between top versions.)
Off the top of my head, I'd use the /proc filesystem view of the system state - Look at man 5 proc to see the format of the entry for /proc/PID/stat, which contains total CPU usage information, and use /proc/stat to get global system information. To obtain "current time" usage, you probably really mean "CPU used in the last N seconds"; take two samples a short distance apart to see the current rate of CPU consumption. You can then munge these values into something useful. Really though, this is probably more a Perl/Ruby/Python job than a pure shell script.
You might be able to get the rough data you're after with /proc/PID/status, which gives a Sleep average for the process. Pretty coarse data though.
Also, use 1 as the iteration count, so you get a current snapshot immediately without waiting $delay for another one:
top -b -n 1
This will not give you a per-process metric, but the Stress Terminal UI is super useful to know how badly you're punishing your boxes. Add -c flag to make it dump the data to a CSV file.
