I'm parsing Linux sar output, and I have a .dat file that looks like this:
07:09:49 CPU %usr %nice %sys %iowait %steal %irq %soft %guest %idle
07:09:51 all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
07:09:53 all 11.82 0.00 0.13 0.00 0.00 0.00 0.00 0.00 88.05
07:09:55 all 53.99 0.00 0.63 0.00 0.13 0.00 0.13 0.00 45.12
07:09:57 all 55.18 0.00 0.25 0.00 0.00 0.00 0.00 0.00 44.57
07:09:59 all 66.58 0.00 0.51 0.00 0.00 0.00 0.13 0.00 32.78
07:10:01 all 71.90 0.00 0.63 0.13 0.00 0.00 0.13 0.00 27.22
07:10:03 all 70.24 0.00 0.63 0.00 0.00 0.00 0.13 0.00 29.00
07:10:05 all 55.39 0.00 0.63 0.00 0.00 0.00 0.13 0.00 43.85
07:10:07 all 72.90 0.00 0.38 0.00 0.00 0.00 0.00 0.00 26.73
07:10:09 all 60.96 0.00 0.38 0.00 0.13 0.00 0.13 0.00 38.40
07:10:11 all 76.60 0.00 0.63 0.00 0.00 0.00 0.13 0.00 22.65
07:10:13 all 53.87 0.00 0.76 0.00 0.00 0.00 0.13 0.00 45.25
07:10:15 all 46.73 0.00 0.63 0.00 0.00 0.00 0.00 0.00 52.64
07:10:17 all 56.37 0.00 0.50 0.00 0.00 0.00 0.13 0.00 43.00
07:10:19 all 58.15 0.00 0.63 0.00 0.00 0.00 0.13 0.00 41.09
07:10:21 all 61.26 0.00 0.75 0.00 0.00 0.00 0.13 0.00 37.86
07:10:23 all 51.50 0.00 0.75 0.12 0.12 0.00 0.25 0.00 47.2
This is my gnuplot script:
set title ' CPU usage'
set xdata time
set timefmt '%H:%M:%S'
set xlabel 'time'
set ylabel 'CPU Usage'
set style data lines
plot 'filename.dat' using 1:3 title '0.6'
pause -1
The labels on the x axis do not match the times in the file.
You have to set the formatting of the tic labels:
set format x '%H:%M:%S'
I am parsing data from Excel files, and the columns of the resulting DataFrame may or may not align with a base DataFrame onto which I want to stack several parsed DataFrames.
Let's call the DataFrame I parse from the data A, and the base DataFrame df_A.
I read an Excel sheet, resulting in A =
Index AGUB AGUG MUEB MUEB SIL SIL SILB SILB
2012-01-01 00:00:00 0.00 0 0.00 50.78 0.00 0.00 0.00 0.00
2012-01-01 01:00:00 0.00 0 0.00 53.15 0.00 53.15 0.00 0.00
2012-01-01 02:00:00 0.00 0 0.00 0.00 53.15 53.15 53.15 53.15
2012-01-01 03:00:00 0.00 0 0.00 0.00 0.00 55.16 0.00 0.00
2012-01-01 04:00:00 0.00 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 05:00:00 48.96 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 06:00:00 0.00 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 07:00:00 0.00 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 08:00:00 0.00 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 09:00:00 52.28 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 10:00:00 0.00 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 11:00:00 36.93 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 12:00:00 0.00 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 13:00:00 0.00 0 0.00 0.00 0.00 0.00 0.00 50.00
2012-01-01 14:00:00 0.00 0 0.00 0.00 0.00 0.00 0.00 34.01
2012-01-01 15:00:00 0.00 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 16:00:00 0.00 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 17:00:00 53.00 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 18:00:00 0.00 75 0.00 75.00 0.00 75.00 0.00 0.00
2012-01-01 19:00:00 0.00 70 0.00 70.00 0.00 0.00 0.00 0.00
2012-01-01 20:00:00 0.00 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 21:00:00 0.00 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 22:00:00 0.00 0 0.00 0.00 0.00 0.00 0.00 0.00
2012-01-01 23:00:00 0.00 0 53.45 53.45 0.00 0.00 0.00 0.00
I create the base dataframe:
import pandas as pd
units = ['MUE', 'MUEB', 'SIL', 'SILB', 'AGUG', 'AGUB', 'MUEBP', 'MUELP']
df_A = pd.DataFrame(columns=units)
df_A = pd.concat([df_A, A], axis=0)
Usually with concat, if A has fewer columns than df_A, it works fine. In this case, though, the only difference in the columns is their order, and the concatenation leads to the following error:
ValueError: Plan shapes are not aligned
I'd like to know how to concatenate the two dataframes with the column order given by df_A.
I've tried this, and it doesn't matter whether there are more columns in the source or in the target DataFrame: either way, the result is a DataFrame whose columns are the union of all supplied columns (columns specified in the target but not present in the source are filled with NaN).
I was able to reproduce your error when the column names in either the source or the target DataFrame include a duplicate name (or empty column names).
In your example, various columns appear more than once in your source file. I don't think concat copes very well with these kinds of duplicate columns.
import pandas as pd
s1 = [0,1,2,3,4,5]
s2 = [0,0,0,0,1,1]
A = pd.DataFrame([s2,s1],columns=['A','B','C','D','E','F'])
Resulting in:
A B C D E F
-----------
0 0 0 0 1 1
0 1 2 3 4 5
Take a subset of columns and use them to create a new dataframe called B
B = A[['A','C','E']]
A C E
-----
0 0 1
0 2 4
Create a new empty target dataframe
col_names = ['D','A','C','B']
Z = pd.DataFrame(columns=col_names)
D A C B
-------
And concatenate the two:
Z = pd.concat([B,Z],axis=0)
A C D E
0 0 NaN 1
0 2 NaN 4
Works fine!
But if I recreate the empty DataFrame with a duplicated column name, like so:
col_names = ['D','A','C','D']
Z = pd.DataFrame(columns=col_names)
D A C D
And try to concatenate:
Z = pd.concat([B,Z],axis=0)
Then I get the error you describe.
It's because of the duplicate columns in the data (for example, SIL appears twice). See: Pandas concat gives error ValueError: Plan shapes are not aligned
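To get the column order given by df_A, one option is to drop the duplicated labels first and then reindex to units before concatenating. This is only a sketch: it assumes that the first occurrence of each duplicated column is the one to keep (the question doesn't say which is wanted), and the helper name align_to_units and the two-row sample frame are made up for illustration.

```python
import pandas as pd

units = ['MUE', 'MUEB', 'SIL', 'SILB', 'AGUG', 'AGUB', 'MUEBP', 'MUELP']

# Small stand-in for the parsed sheet, with duplicated labels as in the question
A = pd.DataFrame(
    [[0.0, 0, 0.0, 50.78, 0.0, 0.0, 0.0, 0.0],
     [0.0, 0, 0.0, 53.15, 0.0, 53.15, 0.0, 0.0]],
    columns=['AGUB', 'AGUG', 'MUEB', 'MUEB', 'SIL', 'SIL', 'SILB', 'SILB'],
)


def align_to_units(frame, columns):
    """Drop repeated column labels, then reorder to the given column list."""
    # Assumption: keep the first occurrence of each duplicated label
    unique = frame.loc[:, ~frame.columns.duplicated()]
    # Columns missing from the frame are created and filled with NaN
    return unique.reindex(columns=columns)


aligned = align_to_units(A, units)

# Several aligned frames can now be stacked without the shape error
stacked = pd.concat([aligned, aligned], axis=0)
print(stacked.columns.tolist())
```

If the duplicated columns actually hold different series, rename them before parsing rather than dropping them, since this sketch silently discards the second copy.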
I am using pm3d map to plot a data file with three columns: x, y, z. The final plot shows a region in 2D, and I have another data file with x, y columns holding the discrete coordinates of some points on the boundary of that region. I want to plot these points on top of the plot generated by pm3d map. If I simply try replot after plotting the pm3d map, the points don't show up in the plot. Can anybody tell me how I can achieve this?
Thanks in advance.
Edit: Here is the minimal example. The data file is something like this:
0.00 -0.50 4
0.00 -0.25 4
0.00 0.00 4
0.00 0.25 4
0.00 0.50 4
0.25 -0.50 1
0.25 -0.25 1
0.25 0.00 1
0.25 0.25 1
0.25 0.50 1
0.50 -0.50 0
0.50 -0.25 0
0.50 0.00 0
0.50 0.25 0
0.50 0.50 0
0.75 -0.50 0
0.75 -0.25 0
0.75 0.00 0
0.75 0.25 0
0.75 0.50 0
1.00 -0.50 3
1.00 -0.25 4
1.00 0.00 4
1.00 0.25 5
1.00 0.50 5
I am plotting this with the following commands:
set pm3d map
set pm3d corners2color c1
spl 'file.dat'
I also have another file border.dat which contains discrete points like this:
0.00 -0.25
0.25 0.25
1.00 0.00
Now I want to plot the points (x and y coordinates) given in this file on top of the plot that pm3d map (I am not using with pm3d; it's pm3d map!) generates for file.dat.
How can I achieve this?
Thank you
The output of iostat looks like this:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 2.40 0.01 3.92 0.16 25.28 12.95 0.05 12.81 6.58 2.58
sda1 0.00 0.00 0.00 0.00 0.00 0.00 25.86 0.00 6.57 5.38 0.00
sda2 0.00 2.40 0.01 3.92 0.16 25.28 12.95 0.05 12.81 6.58 2.58
sdb 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 30.37 20.16 0.00
VG00-LogVol00
0.00 0.00 0.00 0.70 0.02 2.79 8.04 0.02 23.72 3.71 0.26
VG00-LogVol04
0.00 0.00 0.00 4.31 0.03 17.26 8.01 0.07 16.74 4.32 1.87
VG00-LogVol03
0.00 0.00 0.00 0.24 0.00 0.98 8.01 0.01 21.37 8.52 0.21
VG00-LogVol08
0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 14.03 2.31 0.00
VG00-LogVol01
0.00 0.00 0.00 0.00 0.00 0.00 8.03 0.00 127.25 1.17 0.00
VG00-LogVol07
0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 2.42 1.72 0.00
VG00-LogVol06
0.00 0.00 0.00 0.80 0.01 3.21 8.02 0.01 10.28 4.89 0.39
VG00-LogVol02
0.00 0.00 0.01 0.26 0.10 1.04 8.52 0.01 52.88 6.01 0.16
VG00-LogVol05
0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 3.73 0.33 0.00
I am trying to parse the output, but when I reach lines such as "VG00-LogVol00" and "VG00-LogVol04", where the device name and its values are split across two lines, I have problems parsing the text. Is there a way to remove the extra line breaks using sed?
Thank you
If VG00 is always present in split lines, you could do it like this:
sed '/VG00/ { N; s/\n// }'
With the copy/pasted text I have, the following also aligns the columns (works with GNU sed and BSD sed):
sed '/VG00/ { N; s/\n//; s/ \{5,\}/ /; }'
Output:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 2.40 0.01 3.92 0.16 25.28 12.95 0.05 12.81 6.58 2.58
sda1 0.00 0.00 0.00 0.00 0.00 0.00 25.86 0.00 6.57 5.38 0.00
sda2 0.00 2.40 0.01 3.92 0.16 25.28 12.95 0.05 12.81 6.58 2.58
sdb 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 30.37 20.16 0.00
VG00-LogVol00 0.00 0.00 0.00 0.70 0.02 2.79 8.04 0.02 23.72 3.71 0.26
VG00-LogVol04 0.00 0.00 0.00 4.31 0.03 17.26 8.01 0.07 16.74 4.32 1.87
VG00-LogVol03 0.00 0.00 0.00 0.24 0.00 0.98 8.01 0.01 21.37 8.52 0.21
VG00-LogVol08 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 14.03 2.31 0.00
VG00-LogVol01 0.00 0.00 0.00 0.00 0.00 0.00 8.03 0.00 127.25 1.17 0.00
VG00-LogVol07 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 2.42 1.72 0.00
VG00-LogVol06 0.00 0.00 0.00 0.80 0.01 3.21 8.02 0.01 10.28 4.89 0.39
VG00-LogVol02 0.00 0.00 0.01 0.26 0.10 1.04 8.52 0.01 52.88 6.01 0.16
VG00-LogVol05 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 3.73 0.33 0.00
I'd go for
sed -e '/^\([^[:space:]]\+\)$/{N;s/\n//;}'
assuming the device lines have no whitespace.
But I'd also look into whether iostat can be made to produce different output. I don't know its options, though.
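If the parsing is done in a script anyway, the same joining can be done there instead of in sed. Here is a rough Python sketch of the idea behind the second sed command: a line that consists of a single token with no whitespace is assumed to be a wrapped device name and is glued to the line that follows it. The function name join_split_device_lines and the sample list are made up for illustration.

```python
import re


def join_split_device_lines(lines):
    """Merge a line holding only a device name with the data line after it."""
    merged = []
    pending = None  # device name still waiting for its data line
    for line in lines:
        stripped = line.strip()
        if pending is not None:
            # Previous line was a bare device name: glue this line onto it
            merged.append(pending + " " + stripped)
            pending = None
        elif stripped and not re.search(r"\s", stripped):
            # Single token with no whitespace: assume a wrapped device name
            pending = stripped
        else:
            merged.append(stripped)
    if pending is not None:
        merged.append(pending)  # bare name at end of input, keep it as-is
    return merged


sample = [
    "sdb 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 30.37 20.16 0.00",
    "VG00-LogVol00",
    "0.00 0.00 0.00 0.70 0.02 2.79 8.04 0.02 23.72 3.71 0.26",
]
result = join_split_device_lines(sample)
for row in result:
    print(row)
```

After joining, every device row splits into the same number of whitespace-separated fields, so the usual line.split() parsing should work on all of them.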
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.01 1.38 0.02 0.51 0.37 7.56 29.42 0.02 31.72 6.15 0.33
sda1 0.00 0.00 0.00 0.00 0.00 0.00 53.57 0.00 71.60 15.33 0.00
sda2 0.00 0.00 0.00 0.00 0.00 0.00 38.77 0.00 14.13 13.56 0.00
sda3 0.00 0.20 0.02 0.11 0.30 1.23 24.46 0.00 35.65 9.69 0.12
sda4 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00 8.12 8.12 0.00
sda5 0.00 1.11 0.01 0.24 0.06 5.43 44.15 0.01 44.22 7.06 0.18
sda6 0.00 0.00 0.00 0.00 0.00 0.01 39.47 0.00 51.98 17.60 0.00
sda7 0.00 0.00 0.00 0.00 0.00 0.00 23.46 0.00 23.69 13.80 0.00
sda8 0.00 0.00 0.00 0.01 0.00 0.04 11.37 0.00 36.27 24.38 0.02
sda9 0.00 0.05 0.00 0.16 0.00 0.84 10.79 0.00 8.46 7.56 0.12
I have a graph that looks like this. Is 12.5/s considered high? How do I tell whether the I/O is good or bad?