awk transpose specific row to column and group them

I have the following data on stdout:
09:13:32 19.2 cpu(1)
09:13:32 15.6 cpu(2)
09:13:32 16.7 cpu(3)
09:13:32 17.1 cpu(6)
09:13:32 17.1 cpu(7)
09:13:32 16.9 cpu(8)
09:13:32 16.7 cpu(9)
09:13:39 13.0 cpu(1)
09:13:39 9.2 cpu(2)
09:13:39 9.1 cpu(3)
09:13:39 7.1 cpu(6)
09:13:39 27.1 cpu(7)
09:13:39 46.9 cpu(8)
09:13:39 36.7 cpu(9)
I am trying to convert this to something like the following:
['Time', 'cpu(1)', 'cpu(2)', 'cpu(3)', 'cpu(6)', 'cpu(7)', 'cpu(8)', 'cpu(9)'],
['09:13:32', 19.2, 15.6, 16.7, 17.1, 17.1, 16.9, 16.7],
['09:13:39', 13.0, 9.2, 9.1, 7.1, 27.1, 46.9, 36.7]
In other words, I need the original data to be aligned with Google visualization line chart format as stated here: https://developers.google.com/chart/interactive/docs/gallery/linechart
I am trying to achieve this using awk and would appreciate some input. My attempt so far:
awk '{ for(N=1; N<=NF; N+=2) print $N, $(N+1); }' | awk 'BEGIN{q="\047"; printf "["q"Time"q","q"CPU"q"],"}/master/{q="\047"; printf "["q$10q"," $3"],"}' | sed 's/,$//'
Note: I can change the original data columns like below
Time CPU(%) CPU Number
OR
CPU(%) CPU Number Time

This works with any awk, for any number of timestamps and any number of CPUs, and it still works when some time+cpu combinations have no data, as long as the input is small enough to fit in memory:
$ cat tst.awk
BEGIN {
    OFS = ", "
}
!seenTimes[$1]++ {
    times[++numTimes] = $1
}
!seenCpus[$3]++ {
    cpus[++numCpus] = $3
}
{
    vals[$1,$3] = $2
}
END {
    printf "[\047%s\047%s", "Time", OFS
    for ( cpuNr=1; cpuNr<=numCpus; cpuNr++ ) {
        cpu = cpus[cpuNr]
        printf "\047%s\047%s", cpu, (cpuNr<numCpus ? OFS : "]")
    }
    for ( timeNr=1; timeNr<=numTimes; timeNr++ ) {
        time = times[timeNr]
        printf ",%s[\047%s\047%s", ORS, time, OFS
        for ( cpuNr=1; cpuNr<=numCpus; cpuNr++ ) {
            cpu = cpus[cpuNr]
            val = vals[time,cpu]
            printf "%s%s", val, (cpuNr<numCpus ? OFS : "]")
        }
    }
    print ""
}
$ awk -f tst.awk file
['Time', 'cpu(1)', 'cpu(2)', 'cpu(3)', 'cpu(6)', 'cpu(7)', 'cpu(8)', 'cpu(9)'],
['09:13:32', 19.2, 15.6, 16.7, 17.1, 17.1, 16.9, 16.7],
['09:13:39', 13.0, 9.2, 9.1, 7.1, 27.1, 46.9, 36.7]
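When a time+cpu combination is missing, the script above prints an empty cell. The following is a minimal sketch of one way to fill such gaps, assuming the chart should receive the literal null for a missing value (that choice is my assumption, not something stated in the question). It prints only the data rows, and the three sample lines are a shortened, hypothetical version of the input in which cpu(2) has no value at 09:13:39.

```shell
# Shortened sample input: cpu(2) is missing for 09:13:39.
sample='09:13:32 19.2 cpu(1)
09:13:32 15.6 cpu(2)
09:13:39 13.0 cpu(1)'

rows=$(printf '%s\n' "$sample" | awk '
!seenT[$1]++ { times[++nt] = $1 }          # remember times in input order
!seenC[$3]++ { cpus[++nc] = $3 }           # remember cpus in input order
{ vals[$1,$3] = $2 }
END {
    for (t = 1; t <= nt; t++) {
        row = "[\047" times[t] "\047"
        for (c = 1; c <= nc; c++) {
            key = times[t] SUBSEP cpus[c]
            # "in" tests for existence without creating the element
            row = row ", " ((key in vals) ? vals[key] : "null")
        }
        print row "]" (t < nt ? "," : "")
    }
}')
printf '%s\n' "$rows"
```

Using the `in` operator rather than reading `vals[time,cpu]` directly avoids creating empty array elements for the missing combinations.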

Related

How to simplify this awk script and use it in a gnuplot script in loop form

I am attempting to plot a multicolumn file using a gnuplot script.
I am doing it like this:
plot "100.dat" u ($1-CONS):($2*$3) w l lt 4 , \
"200.dat" u ($1-CONS):($2*$3) w l lt 2 , \
"300.dat" u ($1-CONS):($2*$3) w l lt 1
where CONS is a variable defined at the top of the file.
My xrange is set to [-0.2:0.2], while the data extend beyond this range.
What I want to capture (in loop form, for multiple files) is the maximum value of each of the three plots on both the negative and the positive side, together with the corresponding column 1 value, within my xrange.
In a shell script I can do this easily, but I am having trouble defining it in my gnuplot script.
My shell script is below:
for i in 100.0000 200.0000 300.0000
do
grep $i data.dat > $i.dat
awk '{print ($1-CONS), ($2*$3)}' $i.dat | awk '{ if($1 <= 0.2 && $1 >= 0.0) { print }}' > $i.p2.dat ; awk 'BEGIN {min=1000000; max=0;}; { if($2<min && $2 != "") min = $2; if($2>max && $2 != "") max = $2; } END {print min, max}' $i.p2.dat | awk '{print $2}' > $i.p2Max.dat ; PMAX=$(cat $i.p2Max.dat) ; grep "$PMAX" $i.p2.dat | tail -n 1 >> MAX.dat
awk '{print ($1-CONS), ($2*$3)}' $i.dat | awk '{ if($1 <= 0.0 && $1 >= -0.2) { print }}' > $i.mi.dat ; awk 'BEGIN {min=1000000; max=0;}; { if($2<min && $2 != "") min = $2; if($2>max && $2 != "") max = $2; } END {print min, max}' $i.mi.dat | awk '{print $2}' > $i.mi_Max.dat ; N_MAX=$(cat $i.mi_Max.dat) ; grep "$N_MAX" $i.mi.dat | tail -n 1 >> MAX.dat
done
I am looking for a simple script that can be used in the gnuplot script in loop form, so that if I have multiple data files and need the maximum of column two on both sides of zero, it stores that maximum together with the corresponding value of column 1, separately for the negative and the positive side.
I would love to see this done with a loop so that I do not need to write all the lines repetitively.

Your description is a bit confusing to me. My understanding is the following: loop through several files and extract the maxima in the x-ranges [-0.2:0] and [0:0.2], respectively.
Test data:
100.dat
-0.17 0.447 0.287
-0.13 0.353 0.936
-0.09 0.476 0.309
-0.05 0.504 0.220
-0.01 0.340 0.564
0.03 0.096 0.947
0.07 0.564 0.885
0.11 0.312 0.957
0.15 0.058 0.347
0.19 0.016 0.923
0.23 0.835 0.461
200.dat
-0.17 0.608 0.875
-0.13 0.266 0.805
-0.09 0.948 0.696
-0.05 0.513 0.800
-0.01 0.736 0.392
0.03 0.318 0.312
0.07 0.708 0.534
0.11 0.246 0.975
0.15 0.198 0.914
0.19 0.174 0.318
0.23 0.727 0.341
300.dat
-0.17 0.527 0.658
-0.13 0.166 0.340
-0.09 0.695 0.031
-0.05 0.623 0.542
-0.01 0.996 0.674
0.03 0.816 0.365
0.07 0.286 0.433
0.11 0.069 0.381
0.15 0.719 0.621
0.19 0.516 0.701
0.23 0.248 0.659
Code:
### loop of files and extracting values
reset session
FILES = "100.dat 200.dat 300.dat"
Count = words(FILES)
CONS = 0.03
# get maxima
array NegMaxX[Count]
array NegMaxY[Count]
array PosMaxX[Count]
array PosMaxY[Count]
do for [i=1:Count] {
    stats [-0.2:0] word(FILES,i) u ($1-CONS):($2*$3) nooutput
    NegMaxX[i] = STATS_pos_max_y
    NegMaxY[i] = STATS_max_y
    stats [0:0.2] word(FILES,i) u ($1-CONS):($2*$3) nooutput
    PosMaxX[i] = STATS_pos_max_y
    PosMaxY[i] = STATS_max_y
}
set xrange[-0.2:0.2]
# set labels
do for [i=1:Count] {
    set label i*2-1 at NegMaxX[i], NegMaxY[i] sprintf("%.3f/%.3f",NegMaxX[i],NegMaxY[i])
    set label i*2 at PosMaxX[i], PosMaxY[i] sprintf("%.3f/%.3f",PosMaxX[i],PosMaxY[i])
}
plot for [i=1:Count] word(FILES,i) u ($1-CONS):($2*$3) w l lt i ti word(FILES,i)
### end of code
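To cross-check the values that stats reports, the same maxima can be computed outside gnuplot. This is a minimal awk sketch, assuming the same three-column layout and the CONS value of 0.03 used above; the variable names (cons, negX, posX and so on) are my own, and the four sample rows are taken from the 100.dat test data.

```shell
# Four rows from 100.dat above (two on each side of zero after the shift).
sample='-0.13 0.353 0.936
-0.09 0.476 0.309
0.07 0.564 0.885
0.11 0.312 0.957'

maxima=$(printf '%s\n' "$sample" | awk -v cons=0.03 '
{
    x = $1 - cons; y = $2 * $3
    # track the largest y, and the x where it occurs, on each side of zero
    if (x >= -0.2 && x <= 0 && (!hasNeg || y > negY)) { negX = x; negY = y; hasNeg = 1 }
    if (x >= 0 && x <= 0.2 && (!hasPos || y > posY)) { posX = x; posY = y; hasPos = 1 }
}
END {
    if (hasNeg) printf "neg %.3f %.3f\n", negX, negY
    if (hasPos) printf "pos %.3f %.3f\n", posX, posY
}')
printf '%s\n' "$maxima"
```

Passing the constant with -v makes it visible inside awk, which a bare shell variable name inside the quoted awk program would not be.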

Recover information from CSV files with my awk script

I have this CSV file:
Monday,linux,6,0.2
Tuesday,linux,0.25,0.2
Wednesday,linux,64,3
I wrote a small script to extract the information from my CSV file and print it like this:
Day : Monday
OS : Linux
RAM : 6
CPU1 : 0.2
My script is :
#!/bin/bash
awk -F'[ ,;|.]' 'FNR==0{next}
FNR>1 {
print "DAY : " $1;
print "OS :\n " $2
print "RAM :\n " $3
print "CPU1 :\n " $4
}' mycsvfile.csv
But the result is:
DAY : Tuesday
OS :
linux
RAM :
0
CPU1 :
25
DAY : Wednesday
OS :
linux
RAM :
64
CPU1
Whereas I want:
DAY : Monday
OS : linux
RAM : 6
CPU 1 : 0.2
DAY : Tuesday
OS : linux
RAM : 0.25
CPU 1 : 0.2
DAY : Wednesday
OS : linux
RAM : 64
CPU 1 : 3
Can you tell me why my script doesn't work and why the decimal values are not handled correctly?
Thank you !

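This is the same awk that Cyrus posted, with a tab and a blank line added. Your script fails for two reasons: the field separator class [ ,;|.] includes the dot, so decimals such as 0.25 are split into two fields, and FNR>1 skips the first line.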
awk -F ',' '{
print "DAY :",$1
print "OS :",$2
print "RAM :",$3
print "CPU1 :",$4"\n"
}' OFS='\t' file
DAY : Monday
OS : linux
RAM : 6
CPU1 : 0.2
DAY : Tuesday
OS : linux
RAM : 0.25
CPU1 : 0.2
DAY : Wednesday
OS : linux
RAM : 64
CPU1 : 3
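To see why the script in the question mangles the decimal values, it helps to compare how one input line is split by the question's separator class and by a plain comma. A minimal sketch, using the Tuesday line from the CSV above; the variable names bad and good are only for illustration.

```shell
line='Tuesday,linux,0.25,0.2'

# Separator class from the question: "." is a separator, so 0.25 becomes "0" and "25".
bad=$(printf '%s\n' "$line" | awk -F'[ ,;|.]' '{ print NF " fields, RAM=" $3 ", CPU1=" $4 }')

# Comma only: the decimals stay intact.
good=$(printf '%s\n' "$line" | awk -F, '{ print NF " fields, RAM=" $3 ", CPU1=" $4 }')

printf '%s\n%s\n' "$bad" "$good"
```

The first command reports 6 fields with RAM=0 and CPU1=25, which matches the wrong output shown in the question; the second reports 4 fields with RAM=0.25 and CPU1=0.2.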
A more generic solution:
awk -F, 'BEGIN {split("DAY OS RAM CPU", header, " ")}{for (i=1;i<=4;i++) print header[i]":\t",$i;print ""}' t
DAY: Monday
OS: linux
RAM: 6
CPU: 0.2
DAY: Tuesday
OS: linux
RAM: 0.25
CPU: 0.2
DAY: Wednesday
OS: linux
RAM: 64
CPU: 3
More readable:
awk -F, '
BEGIN {split("DAY OS RAM CPU", header, " ")}
{
for (i=1;i<=4;i++)
print header[i]":\t",$i;
print ""
}' file

Extract average time using fping

I want to extract the avg time using fping.
fping -q -b 12 -c 3 localhost 192.168.0.20 192.168.0.1 192.168.0.18 192.168.0.22
localhost : xmt/rcv/%loss = 3/3/0%, min/avg/max = 0.06/0.07/0.09
192.168.0.20 : xmt/rcv/%loss = 3/0/100%
192.168.0.1 : xmt/rcv/%loss = 3/3/0%, min/avg/max = 2.00/2.57/3.11
192.168.0.18 : xmt/rcv/%loss = 3/0/100%
192.168.0.22 : xmt/rcv/%loss = 3/3/0%, min/avg/max = 0.12/0.16/0.19
The output should be the average for every device (-1 if the device is unreachable), for example:
0.07
-1
2.57
-1
0.16
Thanks

Using awk:
fping -b 12 -c 3 localhost 192.168.0.20 192.168.0.1 192.168.0.18 192.168.0.22 |
awk -F'/' '{print ($8?$8:"-1")}'
0.07
-1
2.57
-1
0.16
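With / as the field delimiter, this prints the 8th field (the average) if it is present and otherwise prints the string -1. Note that an average of exactly 0.00 would also be reported as -1, because awk treats that value as false.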

$ ... | awk -F/ '{print (/avg/?$(NF-1):-1)}'
This searches for the "avg" keyword; if it is found, it prints the penultimate field, otherwise -1.
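The field logic can be tried without a network by feeding awk a few summary lines copied from the question. This is a minimal sketch, not either answer's exact command: it tests the field count (NF >= 8) instead of the value of $8, on the assumption that an average of 0.00 should still be printed. Some fping versions write the summary to stderr, in which case the real pipeline may need 2>&1 before the pipe.

```shell
# Summary lines copied from the question, so no hosts are actually pinged.
sample='localhost : xmt/rcv/%loss = 3/3/0%, min/avg/max = 0.06/0.07/0.09
192.168.0.20 : xmt/rcv/%loss = 3/0/100%
192.168.0.1 : xmt/rcv/%loss = 3/3/0%, min/avg/max = 2.00/2.57/3.11'

# Split on "/": reachable hosts yield 9 fields with the average in field 8,
# unreachable hosts yield only 5 fields.
avg=$(printf '%s\n' "$sample" | awk -F/ '{ print (NF >= 8 ? $8 : -1) }')
printf '%s\n' "$avg"
```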

Create udev rules file from two input files

I am looking for a way to create an Oracle ASM udev rules file for Linux. I have two input files: file1 lists the ASM disk requirements and file2 lists the available disks.
For example, line 2 of file1 shows that DATA12 needs 3 disks (DATA12_01, DATA12_02, DATA12_03) of 128 GB each. file2 lists every disk with its size. From these two input files I need to create the output file shown below.
cat file1
Count - size - name
3 - 128 GB DATA12
1 - 128 GB TEMP02
2 - 4 GB ARCH03
2 - 1 GB ARCH04
1 - 3 GB ORAC01
cat file2
UUID Size
360060e80166ef70000016ef700006700 128.00 GiB
360060e80166ef70000016ef700006701 128.00 GiB
360060e80166ef70000016ef700006702 128.00 GiB
360060e80166ef70000016ef700006703 128.00 GiB
360060e80166ef70000016ef700006730 4.00 GiB
360060e80166ef70000016ef700006731 4.00 GiB
360060e80166ef70000016ef700006733 1.00 GiB
360060e80166ef70000016ef700006734 1.00 GiB
360060e80166ef70000016ef700006735 3.00 GiB
Output File
ACTION=="add|change", ENV{DM_NAME}=="360060e80166ef70000016ef700006700", SYMLINK+="udevlinks/DATA12_01"
ACTION=="add|change", ENV{DM_NAME}=="360060e80166ef70000016ef700006701", SYMLINK+="udevlinks/DATA12_02"
ACTION=="add|change", ENV{DM_NAME}=="360060e80166ef70000016ef700006702", SYMLINK+="udevlinks/DATA12_03"
ACTION=="add|change", ENV{DM_NAME}=="360060e80166ef70000016ef700006703", SYMLINK+="udevlinks/TEMP02_01"
ACTION=="add|change", ENV{DM_NAME}=="360060e80166ef70000016ef700006730", SYMLINK+="udevlinks/ARCH03_01"
ACTION=="add|change", ENV{DM_NAME}=="360060e80166ef70000016ef700006731", SYMLINK+="udevlinks/ARCH03_02"
ACTION=="add|change", ENV{DM_NAME}=="360060e80166ef70000016ef700006733", SYMLINK+="udevlinks/ARCH04_01"
ACTION=="add|change", ENV{DM_NAME}=="360060e80166ef70000016ef700006734", SYMLINK+="udevlinks/ARCH04_02"
ACTION=="add|change", ENV{DM_NAME}=="360060e80166ef70000016ef700006735", SYMLINK+="udevlinks/ORAC01_01"

Here is one in AWK:
$ cat > test.awk
BEGIN { FS = "([.]| +)" }   # also split on "." so that 128.00 in file2 yields 128 in $2
FNR == 1 { next }           # skip the header line of each file
NR == FNR {                 # file1: build the pool of requested disks
    for (i = 1; i <= $1; i++)
        a[$5 "_" 0 i] = $3  # key is the disk name, value is the requested size
    next
}
{                           # file2: look for a request of the right size
    found = 0
    for (i in a) {
        if (a[i] == $2) {   # when found, print the rule...
            printf "ACTION==\"add|change\", ENV{DM_NAME}==\"%s\", SYMLINK+=\"udevlinks/%s\"\n", $1, i
            delete a[i]     # ...and remove the request from the pool
            found = 1
            break
        }
    }
    if (!found) print "No device found for", $0
}
$ awk -f test.awk file1 file2
ACTION=="add|change", ENV{DM_NAME}=="360060e80166ef70000016ef700006700", SYMLINK+="udevlinks/DATA12_01"
ACTION=="add|change", ENV{DM_NAME}=="360060e80166ef70000016ef700006701", SYMLINK+="udevlinks/DATA12_02"
...
No device found for THIS_IS_AN_EXAMPLE_OF_MISSING_DISK 666.00 GiB
Because the pool is searched with for(i in a), the order in which disks are taken from the pool is not guaranteed.
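If a deterministic assignment matters, one option is to keep the requests in numerically indexed arrays and scan them in input order. This is a minimal sketch under that assumption, not the answer's own method; the array names (name, size, used) and the shortened sample files with placeholder UUIDs (uuid-a and so on) are hypothetical.

```shell
# Hypothetical, shortened versions of file1 and file2.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
Count - size - name
2 - 128 GB DATA12
1 - 4 GB ARCH03
EOF
cat > "$tmp/file2" <<'EOF'
UUID Size
uuid-a 128.00 GiB
uuid-b 4.00 GiB
uuid-c 128.00 GiB
EOF

rules=$(awk '
BEGIN { FS = "([.]| +)" }
FNR == 1 { next }                       # skip both header lines
NR == FNR {                             # file1: remember requests in input order
    for (i = 1; i <= $1; i++) {
        name[++n] = $5 "_0" i
        size[n] = $3
    }
    next
}
{                                       # file2: take the first unused request of this size
    for (k = 1; k <= n; k++)
        if (!used[k] && size[k] == $2) {
            used[k] = 1
            printf "ACTION==\"add|change\", ENV{DM_NAME}==\"%s\", SYMLINK+=\"udevlinks/%s\"\n", $1, name[k]
            next
        }
    print "No device found for " $0
}' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$rules"
rm -rf "$tmp"
```

With this layout the disks are always matched to requests in the order the requests appear in file1, regardless of which awk is used.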

RHEL Release 5.5 (Tikanga), df --total option

I have a RHEL (Red Hat Enterprise Linux) v6.5 (Santiago) server. If I run df --help on this server, it lists the available options. I am interested in the --total option.
However, I also have an older version of RHEL (v5.5), in which there is no --total option.
My question is, I have a command like this:
df -h --total | grep total | awk 'NR==1{print$2}+NR==1{print$3}+NR==1{print$4}+NR==1{print$5}'
which gives the output as
62G
39G
21G
66%
Where
62G is the total size of the disk
39G is used
21G is remaining
66% is the total usage
The above command works fine on RHEL v6.5 but fails on RHEL v5.5, since its df command has no --total option.
When I run the same command on RHEL v5.5 I get the following error:
df: unrecognized option `--total'
Try `df --help' for more information.
So is there a command that can give me the output in the following way:
Total Disk Space
Used Space
Remaining Disk space
Usage %
Ex:
62G
39G
21G
66%

You'll have to do the calculation work yourself.
Something like this awk script should work.
$ cat dftotal.awk
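BEGIN {
    map[0] = "K"    # df -P reports sizes in 1024-byte blocks
    map[1] = "M"
    map[2] = "G"
    map[3] = "T"
}
# Scale a block count to the largest fitting unit; c is a local variable.
function fmt(val,    c) {
    c = 0
    while (val >= 1024 && c < 3) {
        c++
        val = val / 1024
    }
    return sprintf("%.0f%s", val, map[c])
}
NR > 1 {            # skip the header line
    for (i=2;i<5;i++) {
        sum[i]+=$i
    }
}
END {
    print fmt(sum[2]) ORS fmt(sum[3]) ORS fmt(sum[4])
    printf "%.0f%%\n", (sum[2] ? (sum[3] / sum[2]) * 100 : 0)
}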
$ df -P | awk -f dftotal.awk
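The summing step can be checked without a live system by piping in a small, made-up df -P listing. The following is a minimal sketch, assuming whole-GiB output is wanted; the two filesystems and their block counts are hypothetical values chosen so that the totals are easy to verify by hand.

```shell
# Hypothetical `df -P` output (sizes in 1024-byte blocks), used instead of a live system.
sample='Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sda1 41943040 20971520 20971520 50% /
/dev/sdb1 20971520 10485760 10485760 50% /data'

totals=$(printf '%s\n' "$sample" | awk '
NR > 1 { size += $2; used += $3; avail += $4 }   # skip the header, sum the block counts
END {
    gib = 1024 * 1024                             # number of 1024-byte blocks in one GiB
    printf "%.0fG\n%.0fG\n%.0fG\n", size / gib, used / gib, avail / gib
    printf "%.0f%%\n", (size ? used / size * 100 : 0)
}')
printf '%s\n' "$totals"
```

Here the totals are 60G, 30G, 30G and 50%. Note that df itself derives its percentage from used and available space, so a figure computed from the size column can differ slightly from what df --total prints.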
