Linux awk command doesn't print integers correctly? - linux

Can someone explain why this command doesn't print out a list of PID without the newline?
I want output like:
1234 5678 123 456
I tried all these, and none of them work
ps -eww --no-headers -o pid,args | grep 'usr' | awk '{ printf "%d ", $1 }'
ps -eww --no-headers -o pid,args | grep 'usr' | awk '{ printf "%s ", $1 }'
ps -eww --no-headers -o pid,args | grep 'usr' | awk '{ print $1 }' | tr '\n' ''
ps -eww --no-headers -o pid,args | grep 'usr' | awk '{ print $1 }' | tr -d '\n'
I just found out bash works fine, but not zsh in my case

zsh has a feature of letting the user know that the last output line was partial (i.e. there was no final newline). For more details on this you can look up PROMPT_CR, PROMPT_SP and PROMPT_EOL_MARK in man zshoptions.
You can add PROMPT_EOL_MARK='' to your ~/.zshrc to make the partial line indicator empty, but I would advise against it: now we know that it's just a feature, and sometimes we can notice a problem with our data if we leave it enabled. On a reasonably powerful terminal, the percent sign (the default when PROMPT_EOL_MARK is unset) is output bold and inverted, so it can't be confused with a piece of actual output.
Your command's output is a list of pids exactly as you desired. Adding a final newline makes it also look right with zsh:
ps -eww --no-headers -o pid,args | awk '/usr/ { printf "%d ", $1 } END {print""}'
(using also another answer's idea of getting rid of grep using the power of awk).
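If the trailing space bothers you as well, here is a small sketch of the same idea that prints the separator before every PID except the first. The sample_ps function is a made-up stand-in for the real ps output so the example can be run anywhere; it is not part of the original command.

```shell
# Hypothetical sample standing in for `ps -eww --no-headers -o pid,args`.
sample_ps() {
    printf '%s\n' ' 1234 /usr/bin/foo' ' 5678 /usr/sbin/bar' '   42 /bin/sh' '  123 /usr/lib/baz'
}

# Print a separator before every PID except the first, then one final newline.
pids=$(sample_ps | awk '/usr/ { printf "%s%d", sep, $1; sep = " " } END { print "" }')
echo "$pids"
```

With the real command, replace sample_ps with ps -eww --no-headers -o pid,args.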

It works for me like this:
ps -eww --no-headers -o pid,args | awk '/usr/{printf "%d ",$1}'
I.e. awk can search for strings matching regular expressions, so you don't really need grep when using awk.

Related

Why is 'sh -c' returning a different result from my actual command?

I'm trying to get the number of pages in a PDF file through the command line.
pdfinfo "/tmp/temp.pdf" | grep Pages: | awk '{print $2}'
produces
3
In Node.js, I need to use 'sh' because of the piping.
But
sh -c "pdfinfo '/tmp/temp.pdf' | grep Pages: | awk '{print $2}'"
produces
Pages: 3
Why am I getting different output?
The argument to sh -c is double-quoted, so the calling shell expands $2 before sh ever runs. $2 is most likely empty there, so the awk program becomes just print, which prints the whole line. You can escape the $ to prevent it from being expanded:
sh -c "pdfinfo '/tmp/temp.pdf' | grep Pages: | awk '{print \$2}'"
Incidentally, awk can do pattern matching, so no need for both grep and awk:
pdfinfo "/tmp/temp.pdf" | awk '/Pages:/ {print $2}'
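To see the difference without a PDF at hand, here is a rough sketch that replaces pdfinfo with a printf producing one made-up "Pages: 3" line. The function name quoting_demo is only for illustration; calling it with no arguments guarantees that $2 is empty, as it probably was in the question.

```shell
quoting_demo() {
    # Unescaped: the calling shell replaces $2 with nothing, so awk runs '{print }'.
    sh -c "printf 'Pages: 3\n' | awk '{print $2}'"
    # Escaped: \$2 reaches awk intact, so only the second field is printed.
    sh -c "printf 'Pages: 3\n' | awk '{print \$2}'"
}
quoting_demo
```

The first sh -c prints the whole line, the second prints only the page count.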

Extract last digits from each word in a string with multiple words using bash

Given a string with multiple words like below, all in one line:
first-second-third-201805241346 first-second-third-201805241348 first-second-third-201805241548 first-second-third-201705241540
I am trying to get the maximum number from the string; in this case the answer should be 201805241548
I have tried using awk and grep, but I am only getting the answer as last word in the string.
I am interested in how to get this accomplished.
echo 'first-second-third-201805241346 first-second-third-201805241348 first-second-third-201805241548 first-second-third-201705241540' |\
grep -o '[0-9]\+' | sort -n | tail -n 1
The relevant part is grep -o '[0-9]\+' | sort -n | tail -n 1.
Using single gnu awk command:
s='first-second-third-201805241346 first-second-third-201805241348 first-second-third-201805241548 first-second-third-201705241540'
awk -F- -v RS='[[:blank:]]+' '$NF>max{max=$NF} END{print max}' <<< "$s"
201805241548
Or using grep + awk (if gnu awk is not available):
grep -Eo '[0-9]+' <<< "$s" | awk '$1>max{max=$1} END{print max}'
Another awk
echo 'first-...-201705241540' | awk -v RS='[^0-9]+' '$0>max{max=$0} END{print max}'
Gnarly pure bash:
n='first-second-third-201805241346 \
first-second-third-201805241348 \
first-second-third-201805241548 \
first-second-third-201705241540'
z="${n//+([a-z-])/;p=}"
p=0 m=0 eval echo -n "${z//\;/\;m=\$((m>p?m:p))\;};m=\$((m>p?m:p))"
echo $m
Output:
201805241548
How it works: this code constructs code, then runs it.
z="${n//+([a-z-])/;p=}" replaces each run of letters and hyphens with some pre-code that sets $p to the number that follows (useless on its own). The +(...) pattern needs bash's extglob option (shopt -s extglob). At this point echo $z would output:
;p=201805241346 \ ;p=201805241348 \ ;p=201805241548 \ ;p=201705241540
The second substitution replaces each added ; with more code that sets $m to the greater of $m and $p. That string has to be run with eval; the code the eval line actually runs looks like this:
p=0 m=0
m=$((m>p?m:p));p=201805241346
m=$((m>p?m:p));p=201805241348
m=$((m>p?m:p));p=201805241548
m=$((m>p?m:p));p=201705241540
m=$((m>p?m:p))
Print $m.
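For comparison, here is a sketch of a plain loop that avoids eval altogether. It assumes every word ends in a hyphen followed by digits, as in the question, and that the shell's integer comparison handles 64-bit values (bash and dash do).

```shell
s='first-second-third-201805241346 first-second-third-201805241348 first-second-third-201805241548 first-second-third-201705241540'

max=0
for word in $s; do              # unquoted on purpose: split the string on whitespace
    num=${word##*-}             # strip everything up to the last hyphen
    if [ "$num" -gt "$max" ]; then
        max=$num
    fi
done
echo "$max"
```

Note that zsh does not split an unquoted $s by default, so this is meant for bash or sh.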

Trying to join output from ps and pwdx linux commands

I am trying to join output from ps and pwdx command. Can anyone point out the mistake in my command.
ps -eo %p,%c,%u,%a --no-headers | awk -F',' '{ for(i=1;i<=NF;i++) {printf $i", "} ; printf pwdx $1; printf "\n" }'
I expect the last column in each row to be the process directory. But it just shows the value of $1 instead of the command output pwdx $1
This is my output sample (1 row):
163957, processA , userA , /bin/processA -args, 163957
I expected
163957, processA , userA , /bin/processA -args, /app/processA
Can anyone point out what I may be missing
Try this:
ps -eo %p,%c,%u,%a --no-headers | awk -F',' '{ printf "%s,", $0; "pwdx " $1 | getline; print gensub("^[0-9]*: *","","1",$0);}'
Explanation:
awk '{print pwdx $1}' will concatenate the awk variable pwdx (which is empty) and $1 (pid). So, effectively, you were getting only the pid at the output.
In order to run a command and get its output, you need to use this awk construct:
awk '{"some command" | getline; do_something_with $0}'
# After getline, the output will be present in $0.
# For multiline output, use this:
awk '{while ("some command" | getline){do_something_with $0}}'
# Each line of output is available in $0 on successive iterations of the while loop.
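A runnable sketch of the same construct, reading into a named variable instead of overwriting $0. The echo ... | tr command is only a harmless stand-in for "pwdx " $1, and the variable names cmd and result are made up for the example.

```shell
out=$(printf '%s\n' alpha beta | awk '{
    cmd = "echo " $1 " | tr a-z A-Z"   # hypothetical stand-in for "pwdx " $1
    if ((cmd | getline result) > 0)     # read the first line of the command output
        print $1 "," result
    close(cmd)                          # release the pipe before the next record
}')
echo "$out"
```

Calling close(cmd) after each record matters when the command is run once per input line; otherwise awk keeps every pipe open and can run out of file descriptors on long inputs.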
Simplifying your example to focus on how to run the pwdx command from within awk and capture its output in an awk variable, since this is where you were having issues:
ps -eo %p,%c,%u,%a --no-headers | awk -F',' '{ cmd = "pwdx " $1; cmd | getline vpwdx; close(cmd); print vpwdx }'
produces:
1665: /
1690: /
1691: /home/fpm
34248: /home/fpm
34254: /home/fpm/tmp
40181: /home/fpm/UDK2015
...

What does this command do in a Linux shell?

TMPFILE=/tmp/jboss_ps.$$
${PS} ${PS_OPTS} | \
grep ${JBOSS_HOME}/java | \
egrep -v " grep | \
tee | $0 " | ${AWK} '{print $NF " "}' | \
sort -u > ${TMPFILE} 2>/dev/null
I want to know what this precise line is doing from the code above
egrep -v " grep | \
tee | $0 "
At first I thought that line was searching for everything that does not contain the exact string "grep | \ tee | $0", but it appears that egrep is processing the pipes. So what is the significance of the pipes here? Do they mean OR? From my test it appears not, but if they mean output redirection, what is the inner grep getting? And why is tee on its own too?
AFAIK
egrep -v " grep | \
tee | $0 "
is nothing but
egrep -v " grep | tee | $0 "
where \ at the end of a line is the continuation character in bash.
egrep is the same as grep -E
-v inverts the selection
tee is just another string
So, from the lines that contain the string {java path}, egrep -v " grep | tee | $0 " keeps only the lines that do not match any of " grep ", " tee " or " $0 ".
$0 is expanded to the script's name rather than being a literal '$0', because the pattern is in DOUBLE QUOTES and not single quotes :)
Inside double quotes the shell expands variables, and the | characters are passed to egrep as part of the pattern (alternation) instead of acting as shell pipes.
The commands in the pipeline before the egrep command are probably something like
ps -ef | grep .... The egrep -v line you asked about simply omits lines you
don't want in the results: in this case the initial grep command issued by the
script, any tee commands, and lastly $0, which is the name of the script
being executed. egrep accepts multiple patterns enclosed in double quotes and
separated by the pipe symbol. Syntax: egrep [options] "pattern1|pattern2|pattern..."
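A small sketch of that filtering on made-up data. The process lines and the script name check.sh are invented for the example; the script variable plays the role of $0.

```shell
script='check.sh'   # hypothetical stand-in for $0
sample_listing() {
    printf '%s\n' \
        'root 10 java -Dfoo /opt/jboss/java server1' \
        'root 11 grep /opt/jboss/java' \
        'root 12 tee /tmp/out' \
        'root 13 sh check.sh start'
}

# The | inside the quotes is regex alternation, not a shell pipe;
# $script is expanded by the shell because the quotes are double quotes.
kept=$(sample_listing | grep -Ev " grep | tee | $script ")
echo "$kept"
```

Only the java line survives, because the other three each match one of the alternatives.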

How to make GREP select only numeric values?

I use the df command in a bash script:
df . -B MB | tail -1 | awk {'print $4'} | grep .[0-9]*
This script returns:
99%
But I need only numbers (to make the next comparison).
If I use the grep regex without the dot:
df . -B MB | tail -1 | awk {'print $4'} | grep [0-9]*
I receive nothing.
How to fix?
If you try:
echo "99%" |grep -o '[0-9]*'
It returns:
99
Here are the details on how the -o (or --only-matching) flag works, from the grep manual page:
Print only the matched (non-empty) parts of matching lines, with each such part on a separate output line. Output lines use the same delimiters as input, and delimiters are null bytes if -z (--null-data) is also used (see Other Options).
grep will print any lines matching the pattern you provide. If you only want to print the part of the line that matches the pattern, you can pass the -o option:
-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
Like this:
echo 'Here is a line mentioning 99% somewhere' | grep -oE '[0-9]+'
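One thing to watch with -o: in grep's default (basic) syntax an unescaped + is a literal plus sign, so "one or more digits" needs either -E or the [0-9][0-9]* spelling. A short sketch of both, plus a shell-only way to drop the percent sign; the variable names are just for illustration.

```shell
line='Here is a line mentioning 99% somewhere'
num_ere=$(echo "$line" | grep -oE '[0-9]+')        # ERE: + means "one or more"
num_bre=$(echo "$line" | grep -o '[0-9][0-9]*')    # BRE spelling of the same idea

pct='99%'
num_sh=${pct%'%'}                                  # shell-only: strip a trailing %
echo "$num_ere $num_bre $num_sh"
```

Any of the three values can then be used in a numeric test such as [ "$num_sh" -gt 90 ].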
How about:
df . -B MB | tail -1 | awk {'print $4'} | cut -d'%' -f1
No need to use grep here. Try this:
df . -B MB | tail -1 | awk {'print substr($5, 1, length($5)-1)'}
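function getPercentUsed() {
    // exec() collects the output lines in $val; with system() the second argument only receives the exit status
    exec("df -h /dev/sda6 --output=pcent | grep -o '[0-9]*'", $val);
    return $val[0];
}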
Don't use more commands than necessary: leave out tail, grep and cut. You can do this with (a simple) awk alone.
PS: giving a block size and then printing only the percentage is a bit silly ;-) So leave out the "-B MB" as well.
df . |awk -F'[multiple field separators]' '$NF=="Last field must be exactly --> mounted partition" {print $(NF-number from last field)}'
in your case, use:
df . |awk -F'[ %]' '$NF=="/" {print $(NF-2)}'
output: 81
If you want to show the percent symbol, you can drop the -F'[ %]'; the field to print is then one position closer to the end of the line:
df . |awk '$NF=="/" {print $(NF-1)}'
output: 81%
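To check the field arithmetic without depending on the local disk, here is a sketch that runs both awk commands against a made-up df listing. The sample_df function and its numbers are invented for the example.

```shell
# Hypothetical df output for a filesystem mounted on /.
sample_df() {
    printf '%s\n' \
        'Filesystem     1K-blocks     Used Available Use% Mounted on' \
        '/dev/sda1      102400000 82944000  19456000  81% /'
}

# With -F'[ %]' every single space or % is a separator, so "81% /" splits
# into "81", an empty field, and "/": the number is at NF-2.
bare=$(sample_df | awk -F'[ %]' '$NF=="/" {print $(NF-2)}')

# With default whitespace splitting, "81%" is simply the field before "/".
with_pct=$(sample_df | awk '$NF=="/" {print $(NF-1)}')
echo "$bare $with_pct"
```

Both commands rely on the mount point being exactly /; for another mount point, change the string compared against $NF.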
You can use Perl style regular expressions as well. A digit is just \d then.
grep -Po "\\d+" filename
-P Interpret PATTERNS as Perl-compatible regular expressions (PCREs).
-o Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
