Gnuplot Heatmap from multiple Files - gnuplot

My Data looks like this:
2015-08-01 07:00 0.23 0.52 0.00 0.52 9 14.6 14.6 14.6 67 8.5 0.0 --- 0.00 0.0 --- 14.6 14.1 14.1 16.3 1016.2 0.00 0.0 156 0.22 156 0.0 0.00 0.0 0.003 0.000 23.9 39 9.1 23.4 0.05 23 1 100.0 1 1.8797836153192153 660.7143449269239
2015-08-01 07:01 0.25 0.53 0.00 0.53 0 14.6 14.6 14.6 67 8.5 0.0 --- 0.00 0.0 --- 14.6 14.1 14.1 16.3 1016.2 0.00 0.0 153 0.22 153 0.0 0.00 0.0 0.003 0.000 23.9 39 9.1 23.4 0.00 23 1 100.0 1 1.8894284951616422 657.3416264126714 105 73 121 163
2015-08-01 07:02 0.25 0.52 0.00 0.52 0 14.7 14.7 14.6 67 8.6 0.0 --- 0.00 0.0 --- 14.7 14.2 14.2 16.1 1016.2 0.00 0.0 139 0.20 139 0.0 0.00 0.0 0.003 0.000 23.9 39 9.1 23.4 0.00 24 1 100.0 1 1.8976360559992214 654.4985251906015
2015-08-01 07:03 0.26 0.53 0.00 0.53 0 14.7 14.7 14.7 67 8.6 0.0 --- 0.00 0.0 --- 14.7 14.2 14.2 16.1 1016.3 0.00 0.0 139 0.20 144 0.0 0.00 0.0 0.003 0.000 23.9 39 9.1 23.4 0.00 23 1 100.0 1 1.9047561611790007 652.0519661851259
2015-08-01 07:04 0.25 0.53 0.00 0.53 0 14.7 14.7 14.7 67 8.7 0.0 --- 0.00 0.0 --- 14.7 14.2 14.2 16.2 1016.3 0.00 0.0 141 0.20 141 0.0 0.00 0.0 0.003 0.000 23.9 39 9.1 23.4 0.00 24 1 100.0 1 1.903537153899393 652.4695341279602
2015-08-01 07:05 0.25 0.52 0.00 0.52 0 14.8 14.8 14.7 67 8.7 0.0 --- 0.00 0.0 --- 14.8 14.3 14.3 16.3 1016.3 0.00 0.0 148 0.21 148 0.0 0.00 0.0 0.002 0.000 23.9 39 9.1 23.4 0.00 23 1 100.0 1 1.897596925383499 654.5120216976508
........
........
I have multiple files that look like this, one per day: 2015-08-01, 2015-06-05, and so on.
I want to plot the 43rd column against the 3rd and 25th columns as a heat map, using all of those files in ONE plot. These are the values I want to pick out of each file:
0.23 156 660.7143449269239
0.25 153 657.3416264126714
0.25 139 654.4985251906015
0.26 139 652.0519661851259
I got the layout right with dgrid3d, and this is my output so far:
Here's my code:
set dgrid3d
set grid
set palette model HSV defined ( 0 0 1 1, 1 1 1 1 )
set pm3d map
unset surf
set pm3d at b
splot "data_AIT_lvl1_20150604.csv" every ::121::600 using 3:25:43 lc palette title '{/Symbol l}average 20150604',\
"data1.csv" every ::121::361 using 3:25:43 lc palette title '{/Symbol l}average 20150605',\
"data2" every ::121::361 using 3:25:43 lc palette title '{/Symbol l}average 20150606',\
"data3.csv" every ::121::361 using 3:25:43 lc palette title '{/Symbol l}average 20150703',\
and so on for multiple files.
I like the output, but is there a way to improve the overlapping areas in the plot so that the values are easier to distinguish? Is there a gnuplot way to write all the data I want to plot from each file into one big table and then plot that table as a heat map? I tried a few things but lost track of all my trial-and-error steps, so I thought one of you could help me out with a clean approach.
Thanks for the answers so far. I'll try to specify my second question a bit more:
Right now the values of multiple days are plotted in one graph. It looks good, but some parts overlap, so I can't see the values (hue) of all the days in the plot.
In my experience I tend to overcomplicate problems like this, so I decided to ask whether there is a simpler way to solve it.
I thought that putting all the days into one big table would plot all the data on one level, giving me a simple colored heat map.
I tried Joce's table solution, which works flawlessly, but Joce was right: it didn't actually solve my problem.
As you can see, there is now one huge block of data with different colors, but you can't distinguish between the different days. Also, the gap from the first picture (between the big purple block on the left and the orange block) is gone, and everything has melted into one big block.
So I think what I'm really asking is whether there is another, better way, maybe with contours, to get what I want.

What you ask for is
set table
set output "one_big_table"
splot "file1" using c1:c2:c3:..., \
"file2" using c1:c2:c3:..., \
...
unset table
This will create as many blocks as you have files, so I am not sure your final goal will be so easy to achieve. That's a different issue though.
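The same kind of table can also be built outside gnuplot. The following is only a sketch in Python, under a few assumptions: the files are whitespace-separated, columns are counted from 1 as in gnuplot's `using`, and the helper names `extract_columns` and `merge_files` are made up for illustration. It prepends a per-file index to every row so the days remain distinguishable after merging; the `every ::121::600` row selection from the question is not reproduced.

```python
import io

def extract_columns(lines, columns=(3, 25, 43)):
    """Yield the requested 1-based columns from whitespace-separated lines."""
    for line in lines:
        fields = line.split()
        if len(fields) < max(columns):
            continue  # skip empty or truncated lines
        yield [fields[c - 1] for c in columns]

def merge_files(sources, out, columns=(3, 25, 43)):
    """Write one table: a file index followed by the selected columns.

    `sources` is an iterable of open text files (or any iterables of lines).
    Returns the number of rows written.
    """
    rows = 0
    for file_index, src in enumerate(sources):
        for values in extract_columns(src, columns):
            out.write(" ".join([str(file_index)] + values) + "\n")
            rows += 1
    return rows

# Demo with two in-memory "files"; each synthetic line has the 43 fields 1 2 ... 43.
line = " ".join(str(i) for i in range(1, 44))
day_a = io.StringIO(line + "\n" + line + "\n")
day_b = io.StringIO(line + "\n")
table = io.StringIO()
n_rows = merge_files([day_a, day_b], table)
print(table.getvalue(), end="")
```

With real data, the `io.StringIO` objects would be replaced by files opened with `open(...)`. The first column of the merged table can then be used to select or separate individual days when plotting.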

Related

Pandas: How to separate a large df into multiple dfs based on column value

I was wondering whether there is a way to separate the table below into multiple sub-DataFrames using the periodicity in the first column, e.g. each run from ~5 down to ~0.
before:
a b c
5.10 1.00 0.00
4.20 2.00 0.00
3.01 3.00 0.00
2.10 4.00 0.00
1.20 5.00 0.00
0.52 6.00 0.00
0.02 6.00 1.00
5.30 7.00 0.40
4.20 8.00 0.00
3.10 9.00 0.00
2.40 10.00 0.00
1.30 11.00 0.00
0.20 12.00 0.00
5.98 13.00 0.00
4.23 14.00 0.30
3.33 15.00 0.00
2.11 16.00 0.00
1.30 17.00 0.00
0.30 18.00 0.00
5.50 13.00 0.00
output after separating into multiple dfs :
"sub_df1"
5.10 1.00 0.00
4.20 2.00 0.00
3.01 3.00 0.00
2.10 4.00 0.00
1.20 5.00 0.00
0.52 6.00 0.00
0.02 6.00 1.00
"sub_df2"
5.30 7.00 0.40
4.20 8.00 0.00
3.10 9.00 0.00
2.40 10.00 0.00
1.30 11.00 0.00
0.20 12.00 0.00
"sub_df3"
5.98 13.00 0.00
4.23 14.00 0.30
3.33 15.00 0.00
2.11 16.00 0.00
1.30 17.00 0.00
0.30 18.00 0.00
"sub_df4"
5.50 13.00 0.00
The periodicity is variable in length, so I cannot assume a fixed length for the split. My first idea was therefore to add another column 'id':
df['id']=(df['a'].shift(1)>df['a']).astype(int)
This at least shows where each segment starts (a 0) and where the next one begins (the following 0), so I know which rows belong together. However, I don't quite know how to continue from here:
a b c id
0 4.20 2.0 0.0 0
1 3.01 3.0 0.0 1
2 2.10 4.0 0.0 1
3 1.20 5.0 0.0 1
4 0.52 6.0 0.0 1
5 0.02 6.0 1.0 1
6 5.30 7.0 0.4 0
7 4.20 8.0 0.0 1
8 3.10 9.0 0.0 1
9 2.40 10.0 0.0 1
10 1.30 11.0 0.0 1
11 0.20 12.0 0.0 1
12 5.98 13.0 0.0 0
13 4.23 14.0 0.3 1
14 3.33 15.0 0.0 1
15 2.11 16.0 0.0 1
16 1.30 17.0 0.0 1
17 0.30 18.0 0.0 1
18 5.50 13.0 0.0 0
You can create a series s to identify the different groups. From there, you can create multiple dataframes and add them to a dictionary of dataframes, df_dict. The print statement shows how to access them:
s = (df['a'] > df['a'].shift()).cumsum() + 1
df_dict = {}
for frame, data in df.groupby(s):
    df_dict[f'df{frame}'] = data
print(df_dict['df1'], '\n\n',
      df_dict['df2'], '\n\n',
      df_dict['df3'], '\n\n',
      df_dict['df4'])
a b c
0 5.10 1.0 0.0
1 4.20 2.0 0.0
2 3.01 3.0 0.0
3 2.10 4.0 0.0
4 1.20 5.0 0.0
5 0.52 6.0 0.0
6 0.02 6.0 1.0
a b c
7 5.3 7.0 0.4
8 4.2 8.0 0.0
9 3.1 9.0 0.0
10 2.4 10.0 0.0
11 1.3 11.0 0.0
12 0.2 12.0 0.0
a b c
13 5.98 13.0 0.0
14 4.23 14.0 0.3
15 3.33 15.0 0.0
16 2.11 16.0 0.0
17 1.30 17.0 0.0
18 0.30 18.0 0.0
a b c
19 5.5 13.0 0.0
Try this:
listofdfs = [y for x,y in df.groupby(df['a'].diff().gt(0).cumsum())]
or
dict(list(df.groupby(df['a'].diff().gt(0).cumsum())))
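For reference, here is a self-contained sketch that rebuilds the first nine rows of the sample table and applies the grouping key from the answers above. The cut at nine rows is arbitrary (just to keep it short), and the names `key` and `parts` are only illustrative.

```python
import pandas as pd

# First nine rows of the table from the question.
df = pd.DataFrame({
    "a": [5.10, 4.20, 3.01, 2.10, 1.20, 0.52, 0.02, 5.30, 4.20],
    "b": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 6.0, 7.0, 8.0],
    "c": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.4, 0.0],
})

# A new group starts wherever 'a' increases relative to the previous row;
# the cumulative sum turns those "jump" flags into a running group number.
key = df["a"].diff().gt(0).cumsum()

# One sub-DataFrame per group, in original row order.
parts = [group for _, group in df.groupby(key)]

for part in parts:
    print(part, "\n")
```

Because the key only looks at whether `a` went up, it does not depend on how long each period is.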

Support Vector Method

I have the following dataset, which is a small part of a much larger one.
PM2.5 is the dependent variable, while the other seven columns
are the independent variables: AOD, BLH, RH, WS, Prec, Temp, and SLP.
I would like to use Support Vector Machine (SVM) regression with multiple predictors
to find the best-fit multiple-variable regression equation in Python.
I would appreciate your help a lot.
PM2.5 AOD BLH RH WS Prec Temp SLP
43.52 0.42 0.39 0.74 1.2 0.4 4.95 1.03
18.4 0.31 0.41 0.71 2.9 0.0 13.4 1.02
53.36 0.30 0.91 0.75 3.21 2.8 17.2 1.01
18.83 0.36 0.29 0.48 1.7 0.6 20.5 1.02
21.2 0.39 0.36 0.52 0.93 0.1 22.0 1.02
12.17 0.15 0.69 0.52 0.55 0.1 18.67 1.01
8.75 0.11 0.42 0.59 4.98 0.1 18.67 1.01
7.7 0.31 0.048 0.52 0.95 0.0 22.44 1.02
6.58 0.05 0.48 0.57 2.75 0.0 32.38 1.02
Data as an xls file is here
Thanks a lot in advance
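One possible starting point, sketched under several assumptions: scikit-learn is used (the question does not name a library), only the nine rows shown above are fitted, and the values for `C` and `epsilon` are placeholders rather than tuned settings. A linear kernel is chosen because it is the only SVR kernel that yields an explicit equation of the form PM2.5 = intercept + sum(coef * predictor).

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# The nine sample rows from the question: PM2.5 followed by the seven predictors.
names = ["AOD", "BLH", "RH", "WS", "Prec", "Temp", "SLP"]
data = np.array([
    [43.52, 0.42, 0.39, 0.74, 1.20, 0.4, 4.95, 1.03],
    [18.40, 0.31, 0.41, 0.71, 2.90, 0.0, 13.40, 1.02],
    [53.36, 0.30, 0.91, 0.75, 3.21, 2.8, 17.20, 1.01],
    [18.83, 0.36, 0.29, 0.48, 1.70, 0.6, 20.50, 1.02],
    [21.20, 0.39, 0.36, 0.52, 0.93, 0.1, 22.00, 1.02],
    [12.17, 0.15, 0.69, 0.52, 0.55, 0.1, 18.67, 1.01],
    [8.75, 0.11, 0.42, 0.59, 4.98, 0.1, 18.67, 1.01],
    [7.70, 0.31, 0.048, 0.52, 0.95, 0.0, 22.44, 1.02],
    [6.58, 0.05, 0.48, 0.57, 2.75, 0.0, 32.38, 1.02],
])
y = data[:, 0]
X = data[:, 1:]

# SVR is sensitive to feature scale, so standardise the predictors first.
scaler = StandardScaler().fit(X)
model = SVR(kernel="linear", C=10.0, epsilon=0.1)  # hyperparameters are guesses
model.fit(scaler.transform(X), y)

# Convert the coefficients from standardised units back to the original units.
coef = model.coef_.ravel() / scaler.scale_
intercept = model.intercept_[0] - np.sum(coef * scaler.mean_)

equation = " + ".join(f"({c:.3f})*{n}" for c, n in zip(coef, names))
print(f"PM2.5 = {intercept:.3f} + {equation}")
```

With so few rows the coefficients are not meaningful; on the full dataset, the hyperparameters would normally be chosen by cross-validation, and a non-linear kernel may fit better at the cost of losing the closed-form equation.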

Gnuplot pm3d: 'NaN value' removes all surrounding rectangles

I would like to plot a pm3d map, where data points are not equidistant on the axis.
Since the spacings for the x and y axis are identical, it is symmetrical, though.
The problem is whenever a value is "NaN", all of the four surrounding rectangles
are not plotted. In the data file below, this happens, for example, at (x,y)=(0.14,0.33) .
If the value is not 'NaN', then the four rectangles reappear.
I discovered this problem, when I tried to plot only the values >0 or <0, where the same happens.
I tried to search the documentation and the internet, but couldn't find anything on this.
Are there any solutions to this?
Plotscript:
set view map
set pm3d at b
set style data pm3d
set pm3d corners2color c1
set size ratio 1
set autoscale fix
set cbrange [-25:25]
set palette defined (-25 "blue", 0 "white", 25 "red")
set term png
set output "test.png"
splot "data.txt" u 1:2:3 notitle
set output
Data file:
0.0 0.0 1
0.0 0.08 -2
0.0 0.14 3
0.0 0.33 -4
0.0 0.46 5
0.0 0.55 5
0.08 0.0 -6
0.08 0.08 7
0.08 0.14 -8
0.08 0.33 9
0.08 0.46 -10
0.08 0.55 -10
0.14 0.0 11
0.14 0.08 -12
0.14 0.14 13
0.14 0.33 NaN
0.14 0.46 15
0.14 0.55 15
0.33 0.0 -16
0.33 0.08 17
0.33 0.14 -18
0.33 0.33 19
0.33 0.46 -20
0.33 0.55 -20
0.46 0.0 21
0.46 0.08 -22
0.46 0.14 23
0.46 0.33 -24
0.46 0.46 25
0.46 0.55 25
0.55 0.0 21
0.55 0.08 -22
0.55 0.14 23
0.55 0.33 -24
0.55 0.46 25
0.55 0.55 25
Thanks to the comment by @theozh, I figured out a solution to this problem.
I adapted the script by @theozh from "Plotting Heatmap with different column/line widths" into the form below. This yields, for the file
1 -6 11 -16 21
-2 7 -12 17 -22
3 -8 13 -18 23
-4 9 NaN 19 -24
5 -10 15 -20 25
this plot.
This is the best solution for me, because the data has this format anyway and the coordinates are in a different file that I read in.
Plotscript:
CoordsX = "0.04 0.11 0.24 0.40 0.51"
CoordsY = "0.04 0.11 0.24 0.40 0.51"
dimX = words(CoordsX)
dimY = words(CoordsY)
dx(i) = (word(CoordsX,i)-word(CoordsX,i-1))*0.5
dy(i) = (word(CoordsY,i)-word(CoordsY,i-1))*0.5
ndx(i,j) = word(CoordsX,i) - (i-1<1 ? dx(i+1) : dx(i))
pdx(i,j) = word(CoordsX,i) + (i+1>dimX ? dx(i) : dx(i+1))
ndy(i,j) = word(CoordsY,j) - (j-1<1 ? dy(j+1) : dy(j))
pdy(i,j) = word(CoordsY,j) + (j+1>dimY ? dy(j) : dy(j+1))
set xrange[ndx(1,1):pdx(dimX,1)]
set yrange[ndy(1,1):pdy(1,dimY)]
set tic out
max = 25
set cbrange [-max:max]
set palette defined (-max "blue", 0 "white", max "red")
set term png
set output "test.png"
# 'file' was not defined in the original post; the name below is an assumption
file = "data.txt"
plot for [i=1:dimX] file u (real(word(CoordsX,i))):1:(ndx(i,int($0))):(pdx(i,int($0))):(ndy(i,int($0+1))):(pdy(i,int($0+1))):i with boxxyerror fs solid 1.0 palette notitle
set output
### end of code
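The edge computation in the script above places each cell boundary halfway between neighbouring coordinates, and lets the outermost cells extend outward by the half-distance to their only neighbour. The same logic can be sketched in Python; `cell_edges` is a hypothetical helper name, not part of gnuplot.

```python
def cell_edges(centers):
    """Return (low, high) bounds for each cell, given sorted centre coordinates.

    Interior boundaries sit halfway between neighbouring centres; the first
    and last cells extend outward by the half-distance to their only neighbour.
    """
    if len(centers) < 2:
        raise ValueError("need at least two coordinates")
    # half[k] is half the distance between centre k and centre k+1
    half = [(b - a) / 2 for a, b in zip(centers, centers[1:])]
    edges = []
    for i, c in enumerate(centers):
        left = half[i - 1] if i > 0 else half[0]
        right = half[i] if i < len(half) else half[-1]
        edges.append((c - left, c + right))
    return edges

# Coordinates from the plot script above.
coords = [0.04, 0.11, 0.24, 0.40, 0.51]
for center, (low, high) in zip(coords, cell_edges(coords)):
    print(f"{center:.2f}: [{low:.3f}, {high:.3f}]")
```

Each tuple corresponds to the `ndx`/`pdx` (or `ndy`/`pdy`) pair that the gnuplot script passes to `boxxyerror`, which is why a single NaN value only removes its own rectangle.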

gnuplot: draw adjacent 3D boxes in different colors

I am studying container loading algorithms. Once I have a loading plan, I use gnuplot to plot it in 3D, as shown in the attachment. All goods are cuboids, and I want to draw the edges of one box in yellow, the next in brown, then yellow, then brown, and so on. The colors themselves could be anything; my goal is simply to see the loading plan better. Currently I can only plot everything in the same color.
Ideally, the container's own outline would get its own color as well.
Part of my test data is shown under /2/.
/2/
++++++container 40 feet data###########
0 0 0
12.0 0 0
12.0 2.3 0
0 2.3 0
0 0 0
0 0 0
0 0 2.5
12.0 0 2.5
12.0 2.3 2.5
0 2.3 2.5
### container 40 feet data#########
##########first cubic #############
0 0 2.5
0.0 0.0 0.0
0.64 0.0 0.0
0.64 0.66 0.0
0.0 0.66 0.0
0.0 0.0 0.0
0.0 0.0 1.93
0.64 0.0 1.93
0.64 0.66 1.93
0.0 0.66 1.93
0.0 0.0 1.93
0.64 0.0 0.0
0.64 0.0 1.93
0.64 0.66 0.0
0.64 0.66 1.93
0.0 0.66 0.0
0.0 0.66 1.93
################# Second cubic#################
0.64 0.0 0.0
1.27 0.0 0.0
1.27 0.66 0.0
0.64 0.66 0.0
0.64 0.0 0.0
0.64 0.0 1.93
1.27 0.0 1.93
1.27 0.66 1.93
0.64 0.66 1.93
0.64 0.0 1.93
1.27 0.0 0.0
1.27 0.0 1.93
1.27 0.66 0.0
1.27 0.66 1.93
0.64 0.66 0.0
0.64 0.66 1.93
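One possible approach, sketched in Python under stated assumptions: every box is an axis-aligned cuboid given by two opposite corners, each box is written as its own dataset (datasets are separated by two blank lines so gnuplot's `index` can address them), and the edges are drawn with the `vectors nohead` style. The file name `boxes.dat` and the helper names are hypothetical.

```python
def box_edges(x0, y0, z0, x1, y1, z1):
    """Return the 12 edges of an axis-aligned box as (x, y, z, dx, dy, dz) rows."""
    dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
    edges = []
    for y in (y0, y1):
        for z in (z0, z1):
            edges.append((x0, y, z, dx, 0.0, 0.0))  # edges parallel to x
    for x in (x0, x1):
        for z in (z0, z1):
            edges.append((x, y0, z, 0.0, dy, 0.0))  # edges parallel to y
    for x in (x0, x1):
        for y in (y0, y1):
            edges.append((x, y, z0, 0.0, 0.0, dz))  # edges parallel to z
    return edges

def write_boxes(boxes):
    """Return data-file text with one dataset per box (two blank lines between)."""
    blocks = []
    for box in boxes:
        rows = [" ".join(f"{v:g}" for v in edge) for edge in box_edges(*box)]
        blocks.append("\n".join(rows))
    return "\n\n\n".join(blocks) + "\n"

def plot_command(n_boxes, datafile="boxes.dat", colors=("yellow", "brown")):
    """Build an splot command that cycles through `colors`, one dataset per box."""
    parts = [
        f'"{datafile}" index {i} using 1:2:3:4:5:6 with vectors nohead '
        f'lc rgb "{colors[i % len(colors)]}" notitle'
        for i in range(n_boxes)
    ]
    return "splot " + ", \\\n      ".join(parts)

boxes = [
    (0.0, 0.0, 0.0, 0.64, 0.66, 1.93),   # first box from the data above
    (0.64, 0.0, 0.0, 1.27, 0.66, 1.93),  # second box from the data above
]
print(write_boxes(boxes))
print(plot_command(len(boxes)))
```

The container itself could be written as one more box and plotted with its own color in the same way.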

Fitting Function with Multiple Files

My Data:
File 1:
2015-08-01 07:00 0.23 0.52 0.00 0.52 9 14.6 14.6 14.6 67 8.5 0.0 --- 0.00 0.0 --- 14.6 14.1 14.1 16.3 1016.2 0.00 0.0 156 0.22 156 0.0 0.00 0.0 0.003 0.000 23.9 39 9.1 23.4 0.05 23 1 100.0 1 1.8797836153192153 660.7143449269239
File 2:
2015-08-01 07:00 0.23 0.52 0.00 0.52 9 14.6 14.6 14.6 67 8.5 0.0 --- 0.00 0.0 --- 14.6 14.1 14.1 16.3 1016.2 0.00 0.0 156 0.22 156 0.0 0.00 0.0 0.003 0.000 23.9 39 9.1 23.4 0.05 23 1 100.0 1 1.8797836153192153 660.7143449269239
..... and so on.
The .csv files cover multiple days, and from those days I created a scatter plot using 3:43:0.
I used column 0 as a dummy so that I could use variable line colors (otherwise the colors would have repeated themselves after line 9).
The scatter plot looks great, but now I want to fit a curve to it. Two similar questions (Question 1, Question) cover fitting data from multiple files, but when I try the cat or awk commands I always end up with the error "cannot create pipe for data".
So what I tried was:
fit f(x) '< cat file1.csv file2.csv file3.csv file4.csv file5.csv' u 3:43:0 a,b
Am I missing something here?
Could this be an OS problem? I run Windows 7.
Both cat and awk are Unix commands. The Windows equivalent of cat is type. For instance, the following should work:
fit f(x) '< type file1.csv file2.csv' u 3:43:0 via a,b
If, for some reason, you need to use a tool like gawk (the GNU equivalent of awk), grep, or sed on Windows, take a look at gnuwin32.
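If the pipe cannot be made to work at all, a platform-independent alternative is to merge the files once into a single file and point gnuplot at that file. This is only a sketch; the helper name `concatenate` and the file names in the demo are made up.

```python
import tempfile
from pathlib import Path

def concatenate(paths, target):
    """Write the contents of each file in `paths` to `target`, in order."""
    with open(target, "w", newline="\n") as out:
        for path in paths:
            text = Path(path).read_text()
            if text and not text.endswith("\n"):
                text += "\n"  # keep the last row of one file apart from the next
            out.write(text)
    return target

# Demo in a temporary directory with two tiny stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "file1.csv").write_text("1 2\n3 4")  # note: no trailing newline
    (tmp / "file2.csv").write_text("5 6\n")
    merged = concatenate([tmp / "file1.csv", tmp / "file2.csv"], tmp / "all.csv")
    merged_text = merged.read_text()
    print(merged_text, end="")
```

gnuplot can then read the merged file directly by name, without the `<` pipe syntax.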

Resources