Trying to join output from ps and pwdx linux commands - linux

I am trying to join the output of the ps and pwdx commands. Can anyone point out the mistake in my command?
ps -eo %p,%c,%u,%a --no-headers | awk -F',' '{ for(i=1;i<=NF;i++) {printf $i", "} ; printf pwdx $1; printf "\n" }'
I expect the last column in each row to be the process's working directory, but it just shows the value of $1 instead of the output of the command pwdx $1.
This is my output sample (1 row):
163957, processA , userA , /bin/processA -args, 163957
I expected
163957, processA , userA , /bin/processA -args, /app/processA
Can anyone point out what I may be missing?

Try this:
ps -eo %p,%c,%u,%a --no-headers | awk -F',' '{ printf "%s,", $0; cmd = "pwdx " $1; cmd | getline; close(cmd); print gensub("^[0-9]*: *","","1",$0);}'
Explanation:
awk '{print pwdx $1}' will concatenate the awk variable pwdx (which is empty) and $1 (pid). So, effectively, you were getting only the pid at the output.
In order to run a command and get its output, you need to use this awk construct:
awk '{"some command" | getline; do_something_with $0}'
# After getline, the output will be present in $0.
# For multiline output, use this:
awk '{while (("some command" | getline) > 0){do_something_with $0}}'
# Each line of output is available in $0 on successive iterations of the while loop.
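As a minimal, self-contained sketch of the command | getline pattern described above: basename stands in for pwdx so the example runs anywhere, and the two sample paths are made up for illustration. Reading into a named variable (out here) leaves $0 and $1 untouched, and close() releases the pipe after each record.

```shell
# Sample input: one path per line (hypothetical values for illustration).
result=$(printf '%s\n' /usr/bin/awk /etc/passwd |
  awk '{
    cmd = "basename " $1            # build the command string once
    if ((cmd | getline out) > 0)    # read the first output line into "out"
      print $1 " -> " out
    close(cmd)                      # close the pipe so descriptors are not leaked
  }')
printf '%s\n' "$result"
```

Swapping "basename " for "pwdx " should give the behaviour asked for in the question, assuming pwdx is installed and the processes are readable by the current user.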

Simplifying your example to focus on how to execute the pwdx command within awk and capture its result in an awk variable, since this is where you were having issues. Note that system() only returns the exit status, so the output has to be read with getline:
ps -eo %p,%c,%u,%a --no-headers | awk -F',' '{ cmd = "pwdx " $1; if ((cmd | getline vpwdx) > 0) print vpwdx; close(cmd) }'
produces:
1665: /
1690: /
1691: /home/fpm
34248: /home/fpm
34254: /home/fpm/tmp
40181: /home/fpm/UDK2015
...
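A small sketch of the difference between system() and command | getline, in case it helps when adapting the snippets in this thread: system() lets the command write straight to awk's standard output and returns only its exit status, while cmd | getline var stores one line of the command's output in var. The echo commands are placeholders chosen for illustration.

```shell
result=$(awk 'BEGIN {
  rc = system("echo printed-by-system")   # output goes directly to stdout
  cmd = "echo captured-by-getline"
  cmd | getline line                      # output is stored in "line"
  close(cmd)
  print "rc=" rc " line=" line
}')
printf '%s\n' "$result"
```

The first line of the result comes from the command run by system(); only the second line was assembled inside awk.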

Related

awk - send sum to global variable

I have a line in a bash script that calculates the sum of unique IP requests to a certain page.
grep $YESTERDAY $ACCESSLOG | grep "$1" | awk -F" - " '{print $1}' | sort | uniq -c | awk '{sum += 1; print } END { print " ", sum, "total"}'
I am trying to get the value of sum to a variable outside the awk statement so I can compare pages to each other. So far I have tried various combinations of something like this:
unique_sum=0
grep $YESTERDAY $ACCESSLOG | grep "$1" | awk -F" - " '{print $1}' | sort | uniq -c | awk '{sum += 1; print ; $unique_sum=sum} END { print " ", sum, "total"}'
echo "${unique_sum}"
This results in an echo of "0". I've tried placing $unique_sum=sum in the END, various combinations of initializing the variable (awk -v unique_sum=0 ...) and placing the variable assignment outside of the quoted sections.
So far, my Google-fu is failing horribly as most people just send the whole of the output to a variable. In this example, many lines are printed (one for each IP) in addition to the total. Failing a way to capture the 'sum' variable, is there a way to capture that last line of output?
This is probably one of the most sophisticated things I've tried in awk so my confidence that I've done anything useful is pretty low. Any help will be greatly appreciated!
You can't assign a shell variable inside an awk program. In general, no child process can alter the environment of its parent. You have to have the awk program print out the calculated value, and then the shell can grab that value and assign it to a variable:
output=$( grep $YESTERDAY $ACCESSLOG | grep "$1" | awk -F" - " '{print $1}' | sort | uniq -c | awk '{sum += 1; print } END {print sum}' )
unique_sum=$( sed -n '$p' <<< "$output" ) # grab the last line of the output
sed '$d' <<< "$output" # print the output except for the last line
echo " $unique_sum total"
That pipeline can be simplified quite a lot: awk can do what grep can do, so first
grep $YESTERDAY $ACCESSLOG | grep "$1" | awk -F" - " '{print $1}'
is (longer, but only one process)
awk -F" - " -v date="$YESTERDAY" -v patt="$1" '$0 ~ date && $0 ~ patt {print $1}' "$ACCESSLOG"
And the last awk program only counts lines, so it can be replaced with wc -l.
All together:
unique_output=$(
awk -F" - " -v date="$YESTERDAY" -v patt="$1" '
$0 ~ date && $0 ~ patt {print $1}
' "$ACCESSLOG" | sort | uniq -c
)
echo "$unique_output"
unique_sum=$( wc -l <<< "$unique_output" )
echo " $unique_sum total"
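A runnable sketch of the same idea on made-up data, for anyone who wants to try it without a real access log. The four log lines and the /page pattern are hypothetical, and the date filter is left out to keep the example short.

```shell
# Hypothetical access-log lines in the "IP - rest" shape the question implies.
log='10.0.0.1 - GET /page
10.0.0.2 - GET /page
10.0.0.1 - GET /page
10.0.0.3 - GET /other'

# Per-IP request counts for lines matching the pattern.
unique_output=$(printf '%s\n' "$log" |
  awk -F' - ' -v patt='/page' '$0 ~ patt { print $1 }' | sort | uniq -c)

# One output line per distinct IP, so the line count is the number of unique IPs.
# tr strips the padding that some wc implementations add.
unique_sum=$(printf '%s\n' "$unique_output" | wc -l | tr -d ' ')

echo "$unique_output"
echo " $unique_sum total"
```

Because the value is captured with command substitution, unique_sum is an ordinary shell variable and can be compared against the result for another page.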

awk command not working as expected when bash variable is used inside

I tried to use a bash variable inside awk by passing it in as an awk variable, as shown below, but it does not seem to work.
b=hi
$ echo "hihello" |awk -v myvar=$b -F"$0~myvar" '{print $2}'
Actual Output is :
<empty / nothing printed >
Expected output is :
hello
Why don't you do this:
b=hi ; echo "hihello" | awk -F"$b" '{print $2}'
hello
Try the below awk command. Put the Field Separator inside BEGIN block.
$ b=hi; echo "hihello" | awk -v myvar=$b 'BEGIN{FS=myvar}{print $2}'
hello
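This sets the field separator to the value of myvar, so printing the second field gives the string hello. The original command fails because -F expects a separator string or regex, not an awk expression, and inside double quotes the shell expands $0 (the shell's own name) before awk ever sees it.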
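For comparison, a short sketch with three ways of getting a shell variable into the field separator. The first two come from the answers above; the third, assigning FS directly with -v, is an additional variant rather than something proposed in the thread.

```shell
b=hi

# 1. Let the shell expand the variable straight into -F.
out_F=$(echo "hihello" | awk -F"$b" '{ print $2 }')

# 2. Pass it as an awk variable and set FS in BEGIN.
out_begin=$(echo "hihello" | awk -v myvar="$b" 'BEGIN { FS = myvar } { print $2 }')

# 3. Assign FS directly; -v assignments take effect before the first record is read.
out_vFS=$(echo "hihello" | awk -v FS="$b" '{ print $2 }')

printf '%s\n' "$out_F" "$out_begin" "$out_vFS"
```

Quoting "$b" matters once the value may contain spaces or glob characters.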

cut command to delimit the output and add them

I am new to bash.
I wrote a bash script and it gives me an output like this:
3387 /test/file1
23688 /test/file2
5813 /test/file3
10415 /test/file4
1304 /test/file5
46 /test/file6
8 /test/file7
138 /test/file8
I can delimit them by
wc -l /path/to/$dir/test | cut -d" " -f1
How can I add these numbers together to calculate a total?
Can I do:
output=`wc -l /path/to/$dir/test | cut -d" " -f1`
Is it possible to use a "while" or "for" loop to add those numbers? If so, how?
Thank you in advance.
You want awk here to avoid explicit loops. If your output was in the file data.txt you could use:
$ awk '{sum += $1} END {print sum}' data.txt
44799
In your case, pipe the output of your script to awk:
$ your_script.sh | awk '{sum += $1} END {print sum}'
Since the output you gave in your question was the output of wc -l, try:
$ wc -l /path/to/$dir/test | awk '{sum += $1} END {print sum}'
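(Aside for anyone else landing on this page: wc -l, when given several files, for example through a wildcard, also prints a total line itself. Summing that output with the awk command above would count the total twice, so either use wc's total directly or skip that line in awk. awk is still useful here because you can work with the total line count directly and pipe just that to another process.)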
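A self-contained sketch using the sample numbers from the question. The second half shows one possible way to ignore the total line that wc -l appends for multiple files; it assumes no input file is itself named "total".

```shell
# Sample data copied from the question.
data='3387 /test/file1
23688 /test/file2
5813 /test/file3
10415 /test/file4
1304 /test/file5
46 /test/file6
8 /test/file7
138 /test/file8'

# Plain sum of the first column.
total=$(printf '%s\n' "$data" | awk '{ sum += $1 } END { print sum }')

# Same data with a wc-style total line appended; skip that line so the
# figure is not counted twice.
with_total=$(printf '%s\n%s\n' "$data" '44799 total')
total_skipping=$(printf '%s\n' "$with_total" |
  awk '$2 != "total" { sum += $1 } END { print sum }')

printf '%s\n' "$total" "$total_skipping"
```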

Linux awk command doesn't print integers correctly?

Can someone explain why this command doesn't print the list of PIDs on a single line, without newlines?
I want output like:
1234 5678 123 456
I tried all these, and none of them work
ps -eww --no-headers -o pid,args | grep 'usr' | awk '{ printf "%d ", $1 }'
ps -eww --no-headers -o pid,args | grep 'usr' | awk '{ printf "%s ", $1 }'
ps -eww --no-headers -o pid,args | grep 'usr' | awk '{ print $1 }' | tr '\n' ''
ps -eww --no-headers -o pid,args | grep 'usr' | awk '{ print $1 }' | tr -d '\n'
I just found out bash works fine, but not zsh in my case
zsh has a feature of letting the user know that the last output line was partial (i.e. there was no final newline). For more details on this you can look up PROMPT_CR, PROMPT_SP and PROMPT_EOL_MARK in man zshoptions.
You can add PROMPT_EOL_MARK='' to your ~/.zshrc to make the partial line indicator empty, but I would advise against it: now we know that it's just a feature, and sometimes we can notice a problem with our data if we leave it enabled. On a reasonably powerful terminal, the percent sign (the default when PROMPT_EOL_MARK is unset) is output bold and inverted, so it can't be confused with a piece of actual output.
Your command's output is a list of pids exactly as you desired. Adding a final newline makes it also look right with zsh:
ps -eww --no-headers -o pid,args | awk '/usr/ { printf "%d ", $1 } END {print""}'
(using also another answer's idea of getting rid of grep using the power of awk).
It works for me like this:
ps -eww --no-headers -o pid,args | awk '/usr/{printf "%d ",$1}'
I.e. awk can search for strings matching regular expressions, so you don't really need grep when using awk.
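A sketch of a variant that joins the PIDs with single spaces, avoids a trailing space, and still ends the line with a newline. Made-up ps-style lines are used instead of real ps output so the result is predictable; sep is simply an awk variable that is empty for the first match.

```shell
# Hypothetical "pid args" lines standing in for ps output.
sample='1234 /usr/bin/foo
5678 /usr/sbin/bar
 999 /bin/sh
 123 /usr/lib/baz'

pids=$(printf '%s\n' "$sample" |
  awk '/usr/ { printf "%s%s", sep, $1; sep = " " } END { print "" }')
printf '%s\n' "$pids"
```

With real data, replacing the printf on the left of the pipe with ps -eww --no-headers -o pid,args should behave the same way.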

How do I count number of instances of an output in awk?

TL;DR
The idea is :
awk '{
IP[$1]++;
}
END {
for(var in IP)
print IP[var]
}
}' getline < sockstat | awk '{print $2 "#" $3}' | grep -v '^PROCESS#PID'
I want to count the number of instances of each distinct line in the output of:
sockstat | awk '{print $2 "#" $3}' | grep -v '^PROCESS#PID'
Which looks like:
ubuntu-geoip-pr#2382
chrome#2453
chrome#2453
chrome#2453
chrome#2453
chrome#2453
chrome#2453
chrome#2453
chrome#2453
rhythmbox#4759
rhythmbox#4759
rhythmbox#4759
Finally, I want to get the output as:
1
8
3
This corresponds to the number of occurrences of each of the items in the previous output.
Problem in full:
The sockstat command prints networking statistics for the local host. I first build a single key from the second and third columns of its output (PROCESS and PID, respectively), in the form PROCESS#PID. Then I want to calculate the frequency of each unique key in that output. One way to do this is to use the awk getline construct, but that seems to work only for files, and I have not been able to make it pull input directly from the above command.
I do not want to use temporary files, as that takes away the elegance of the solution.
sockstat | awk '{print $2 "#" $3}' | grep -v '^PROCESS#PID' | sort | uniq -c | awk '{print $1}'
How about this?
sockstat | grep -v PROCESS | awk '{key=$2"#"$3; count[key]++} END {for ( key in count ) { print key" "count[key]; } }'
You could simplify your command:
sockstat | awk 'NR>1 { a[$2 "#" $3]++ } END { for (i in a) print a[i], i }'
If you just want the counts, simply edit the print statement:
sockstat | awk 'NR>1 { a[$2 "#" $3]++ } END { for (i in a) print a[i] }'
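Note that for (i in a) visits keys in an unspecified order, so the counts may not come out in the order the keys first appeared. A sketch that keeps first-seen order, run on the sample keys from the question rather than live sockstat output (the order and seen array names are arbitrary):

```shell
# Sample keys copied from the question: 1 x ubuntu-geoip-pr, 8 x chrome, 3 x rhythmbox.
keys=$(printf '%s\n' 'ubuntu-geoip-pr#2382' \
  'chrome#2453' 'chrome#2453' 'chrome#2453' 'chrome#2453' \
  'chrome#2453' 'chrome#2453' 'chrome#2453' 'chrome#2453' \
  'rhythmbox#4759' 'rhythmbox#4759' 'rhythmbox#4759')

counts=$(printf '%s\n' "$keys" | awk '
  !($0 in seen) { order[++n] = $0 }   # remember each key the first time it appears
  { seen[$0]++ }                      # count every occurrence
  END { for (i = 1; i <= n; i++) print seen[order[i]], order[i] }')
printf '%s\n' "$counts"
```

Printing only seen[order[i]] in the END block gives the bare counts asked for in the question.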

Resources