Why does my process counting script give false positives? - linux

I have the following bash script, that lists the current number of httpd processes, and if it is over 60, it should email me. This works 80% of the time but for some reason sometimes it emails me anyways when it is not over 60. Any ideas?
#!/bin/bash
lines=`ps -ef|grep httpd| wc -l`
if [ "$lines" -gt "60" ]
then
mailx -s "Over 60 httpd processes" me#me.com < /dev/null
fi

There is a delay between checking and emailing. In that time, some httpd processes might finish, or start, or both. So, the number of processes can be different.
You are including the grep process in the processes (most of the time, it could happen that the ps finishes before grep starts). An easy way to avoid that is to change your command to ps -ef | grep [h]ttpd. This will make sure that grep doesn't match grep [h]ttpd.
On linux, you have pgrep, which might be better suited for your purposes.
grep ... | wc -l can usually be replaced with grep -c ....
If you want to limit the number of httpd requests, I am sure you can set it in apache configuration files.

You've probably thought of this, but ...
At time t0, there are 61.
At time t1, when you read the email, there are 58.
Try including the value of $lines in the email and you'll see.
Or try using /proc/*/cmdline, it might be more reliable.

grep httpd finds all processes that include httpd in their name, including possibly grep httpd itself, and perhaps other ones.

"ps -ef|grep httpd" doesn't find just httpd processes, does it? It finds processes whose full (-f) listing in ps includes the string "httpd".

This probably doesn't solve your issue but you could simplify things by using pgrep instead.

you can do it this way too, reducing the use of grep and wc to just one awk.
ps -eo args|awk '!/awk/&&/httpd/{++c}
END{
if (c>60){
cmd="mailx -s \047Over 60\047 root"
cmd | getline
}
}'

Related

How to trace the list of PIDs running on a specific core?

I'm trying to run a program on a dedicated core in Linux. (I know Jailhouse is a good way to do so, but I have to use off-the-shelf Linux. :-( )
Other processes, such as interrupt handlers, kernel threads, service progresses, may also run on the dedicated core occasionally. I want to disable as many such processes as possible. To do that, I need first pin down the list of processes that may run on the dedicated core.
My question is:
Is there any existing tools that I can use to trace the list of PIDs or processes that run on a specific core over a time interval?
Thank you very much for your time and help in this question!
TL;DR Dirty hacky solution.
DISCLAIMER: At some point stops working "column: line too long" :-/
Copy this to: core-pids.sh
#!/bin/bash
TARGET_CPU=0
touch lastPIDs
touch CPU_PIDs
while true; do
ps ax -o cpuid,pid | tail -n +2 | sort | xargs -n 2 | grep -E "^$TARGET_CPU" | awk '{print $2}' > lastPIDs
for i in {1..100}; do printf "#\n" >> lastPIDs; done
cp CPU_PIDs aux
paste lastPIDs aux > CPU_PIDs
column -t CPU_PIDs > CPU_PIDs.humanfriendly.tsv
sleep 1
done
Then
chmod +x core-pids.sh
./core-pids.sh
Then open CPU_PIDs.humanfriendly.tsv with your favorite editor, and ¡inspect!
The key is in the "ps -o cpuid,pid" bit, for more detailed info, please comment. :D
Explanation
Infinite loop with
ps -o cpuid,pid | tail -n +2 | sort | xargs -n 2 | grep -E "^$TARGET_CPU" | awk '{print $2}' > lastPIDs
ps ax -o cpuid,pid
Show pid's associated to CPU
tail -n +2
remove headers
sort
sort by cpuid
xargs -n 2
remove white spaces at begging
grep -E "^$TARGET_CPU"
filter by CPU id
awk '{print $2}'
get pid column
> lastPIDs
output to file those las pid's for the target CPU id
for i in {1..10}; do printf "#\n" >> lastPIDs; done
hack for pretty .tsv print with the "columns -t" command
cp CPU_PIDs aux
CPU_PIDs holds the whole timeline, we copy it to aux file to allow the next command to use it as input and output
paste lastPIDs aux > CPU_PIDs
Append lastPIDs columns to the whole timeline file CPU_PIDs
column -t CPU_PIDs > CPU_PIDs.humanfriendly.tsv
pretty print whole timeline CPU_PIDs file
Attribution
stackoverflow answer to: ps utility in linux (procps), how to check which CPU is used
by Mikel
stackoverflow answer to: Echo newline in Bash prints literal \n
by sth
stackoverflow answer to: shell variable in a grep regex
by David W.
superuser answer to: Aligning columns in output from a UNIX command
Janne Pikkarainen
nixCraft article: HowTo: Unix For Loop 1 to 100 Numbers
The best way to obtain what you want is to operate as follows:
Use the isolcpus= Linux kernel boot parameter to "free" one core from the Linux scheduler
Disable the irqbalance daemon (in case it is executing)
Set the IRQs affinities to the other cores by manually writing the CPU mask on /proc/irq/<irq_number>/smp_affinity
Finally, run your program setting the affinity to the dedicated core through the taskset command.
In this case, such core will only execute your program. For checking, you can type ps -eLF and look at the PSR column (which specifies the CPU number).
Not a direct answer to the question, but I am usually using perf context-switches software event to identify the perturbation of the system or other processes on my benchmarks

How do I tail top -c?

I have a couple ruby scripts running on my machine and some other ruby processes. The only way I can differentiate them with top is by doing top -c (so I can see the command, otherwise everything is just 'ruby').
I want to be able to watch how many scripts are running so I can restart them if one fails.
I am thinking I can do this with top -c -n 1 | grep "script-name" but I can't figure out how to tail -f that or if that command is the best way to do it in the first place.
I think that top it's not the best choice here, because it's an interactive command and you can't really pipe its whole output (probably there is a way). One of the fair enough ways to do it would be using ps:
ps -e -o pid,cmd | grep "script-name"
If you want to periodically investigate this, you can also use watch:
watch 'ps -e -o pid,cmd | grep "script-name"'
In general, it's a bad practice to grep ps output, but I suppose in your case will work. If you only want the number of running processes that match against a pattern or you just want their PIDs, you'd better go with pgrep.
pgrep "script-name"

Why `read -t` is not timing out in bash on RHEL?

Why read -t doesn't time out when reading from pipe on RHEL5 or RHEL6?
Here is my example which doesn't timeout on my RHEL boxes wile reading from the pipe:
tail -f logfile.log | grep 'something' | read -t 3 variable
If I'm correct read -t 3 should timeout after 3 seconds?
Many thanks in advance.
Chris
GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
The solution given by chepner should work.
An explanation why your version doesn't is simple: When you construct a pipe like yours, the data flows through the pipe from the left to the right. When your read times out however, the programs on the left side will keep running until they notice that the pipe is broken, and that happens only when they try to write to the pipe.
A simple example is this:
cat | sleep 5
After five seconds the pipe will be broken because sleep will exit, but cat will nevertheless keep running until you press return.
In your case that means, until grep produces a result, your command will keep running despite the timeout.
While not a direct answer to your specific question, you will need to run something like
read -t 3 variable < <( tail -f logfile.log | grep "something" )
in order for the newly set value of variable to be visible after the pipeline completes. See if this times out as expected.
Since you are simply using read as a way of exiting the pipeline after a fixed amount of time, you don't have to worry about the scope of variable. However, grep may find a match without printing it within your timeout due to its own internal buffering. You can disable that (with GNU grep, at least), using the --line-buffered option:
tail -f logfile.log | grep --line-buffered "something" | read -t 3
Another option, if available, is the timeout command as a replacement for the read:
timeout 3 tail -f logfile.log | grep -q --line-buffered "something"
Here, we kill tail after 3 seconds, and use the exit status of grep in the usual way.
I dont have a RHEL server to test your script right now but I could bet than read is exiting on timeout and working as it should. Try run:
grep 'something' | strace bash -c "read -t 3 variable"
and you can confirm that.

Get pid of last started instance of a certain process

I have several instances of a certain process running and I want to determine the process id of the one that has been started last.
So far I came to this code:
ps -aef | grep myProcess | grep -v grep | awk -F" " '{print $2}' |
while read line; do
echo $line
done
This gets me all process ids of myProcess. Somehow I need to compare now the running times of this pids and find out the one with the smallest running time. But I don't know how to do that...
An easier way would be to use pgrep with its -n, --newest switch.
Select only the newest (most recently started) of the matching
processes.
Alternatively, if you don't want to use pgrep, you can use ps and sort by start time:
ps -ef kbsdstart
Use pgrep. It has a -n (newest) option for that. So just try
pgrep -n myProcess

Linux / Bash, using ps -o to get process by specific name?

I am trying to use the ps -o command to get just specific info about processes matching a certain name. However, I am having some issues on this, when I try to use this even to just get all processes, like so, it just returns a subset of what a normal ps -ef would return (it doesn't return nearly the same number of results so its not returning all running processes)
ps -ef -o pid,time,comm
I want to try something like this (below) but incorporate the ps -o to just get specific info from it (just the PID)
ps -ef |grep `whoami`| grep firefox-bin
Any advice is appreciated as to how to do this properly, thanks
This will get you the PID of a process by name:
pidof name
Which you can then plug back in to ps for more detail:
ps -p $(pidof name)
This is a bit old, but I guess what you want is: ps -o pid -C PROCESS_NAME, for example:
ps -o pid -C bash
EDIT: Dependening on the sort of output you expect, pgrep would be more elegant. This, in my knowledge, is Linux specific and result in similar output as above. For example:
pgrep bash
ps -fC PROCESSNAME
ps and grep is a dangerous combination -- grep tries to match everything on each line (thus the all too common: grep -v grep hack). ps -C doesn't use grep, it uses the process table for an exact match. Thus, you'll get an accurate list with: ps -fC sh rather finding every process with sh somewhere on the line.
Sometimes you need to grep the process by name - in that case:
ps aux | grep simple-scan
Example output:
simple-scan 1090 0.0 0.1 4248 1432 ? S Jun11 0:00
Sorry, much late to the party, but I'll add here that if you wanted to capture processes with names identical to your search string, you could do
pgrep -x PROCESS_NAME
-x Require an exact match of the process name, or argument list if -f is given.
The default is to match any substring.
This is extremely useful if your original process created child processes (possibly zombie when you query) which prefix the original process' name in their own name and you are trying to exclude them from your results. There are many UNIX daemons which do this. My go-to example is ninja-dev-sync.

Resources