Grouping Data Sets Together + Legend - colors

I'm unfamiliar with the terminology for what I'm trying to do (which makes it difficult to find the solution), but I think you can figure it out from the input file. The input file is a tab-separated .txt file.
#input file begins here
21 00 0.005 12.0 0.006621 0.35365 0.16718
22 00 0.005 14.0 0.00662 0.34899 0.17206
23 00 0.005 16.0 0.006645 0.34523 0.17739
24 00 0.005 18.0 0.006696 0.33956 0.1815
25 00 0.005 20.0 0.006755 0.33477 0.18692
26 00 0.005 22.0 0.006797 0.33084 0.19178
27 00 0.005 24.0 0.006892 0.3265 0.19683
28 00 0.005 26.0 0.006965 0.32093 0.20256
29 00 0.005 28.0 0.007072 0.31631 0.20747
31 00 0.007 12.0 0.006158 0.38969 0.12999
32 00 0.007 14.0 0.006124 0.38578 0.13541
33 00 0.007 16.0 0.006136 0.38161 0.14018
34 00 0.007 18.0 0.006147 0.37697 0.1452
35 00 0.007 20.0 0.006193 0.37356 0.14999
36 00 0.007 22.0 0.006238 0.3673 0.15499
37 00 0.007 24.0 0.006276 0.36387 0.16037
38 00 0.007 26.0 0.00634 0.35855 0.16595
39 00 0.007 28.0 0.006417 0.35388 0.17118
40 00 0.007 30.0 0.006497 0.34844 0.17673
I would like to differentiate between these two blocks of data on the graph. The graph will be a 2D plot, with the top block's points in red and the bottom block's points in blue. The total input file is about 1000 lines long and its blocks have different lengths; however, they are all appropriately separated with the newline character (\n).
I'm plotting columns 4 and 6 with the data set name (a.k.a. legend label) being column 3.

Here is how you can address the different points:
Your input file indeed consists of two data blocks, which can be selected with every for plotting: every :::0::0 selects only the first block, see the documentation or help every.
To use a red line color, just use e.g.
plot 'file.txt' linecolor rgb 'red'
To select columns 4 and 6 for plotting, use using 4:6.
Using the values of the third column as key labels is not straightforward. If you know that it is a numerical value, then you can use the stats command to extract these single values (see e.g. Gnuplot: How to load and display single numeric value from data file):
stats 'file.txt' using 3 every :::0::0 nooutput
key1 = sprintf('%.3f', STATS_max)
stats 'file.txt' using 3 every :::1::1 nooutput
key2 = sprintf('%.3f', STATS_max)
If the column can also contain non-numerical values, or you want to keep the formatting used in the file, you need an external tool to extract the values for the title:
key1 = system('head -1 file.txt | cut -f 3')
key2 = system('tail -1 file.txt | cut -f 3')
So, altogether, your script may look like the following:
stats 'file.txt' using 3 every :::0::0 nooutput
key1 = sprintf('%.3f', STATS_max)
stats 'file.txt' using 3 every :::1::1 nooutput
key2 = sprintf('%.3f', STATS_max)
plot 'file.txt' using 4:6 every :::0::0 linecolor rgb 'red' title key1,\
'' using 4:6 every :::1::1 linecolor rgb 'blue' title key2
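To see what the every block selection and the head/tail label extraction amount to, here is a minimal Python sketch. It assumes, as the answer does, that the blocks are separated by a blank line, that fields are separated by tabs or spaces, and that the label in column 3 is constant within a block. The names read_blocks and summarize and the inline sample are illustrative only.

```python
def read_blocks(text):
    """Split text into blocks of rows, using blank lines as block separators."""
    blocks, current = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip comment lines such as "#input file begins here"
        if not stripped:
            if current:  # a blank line closes the current block
                blocks.append(current)
                current = []
            continue
        current.append(stripped.split())
    if current:
        blocks.append(current)
    return blocks


def summarize(block):
    """Return (label, points): column 3 as text, columns 4 and 6 as floats."""
    label = block[0][2]  # kept as text so the file's formatting survives
    points = [(float(row[3]), float(row[5])) for row in block]
    return label, points


sample = """\
21 00 0.005 12.0 0.006621 0.35365 0.16718
22 00 0.005 14.0 0.00662 0.34899 0.17206

31 00 0.007 12.0 0.006158 0.38969 0.12999
32 00 0.007 14.0 0.006124 0.38578 0.13541
"""

for label, points in map(summarize, read_blocks(sample)):
    print(label, points)
```

Keeping the label as text mirrors the head/cut variant above; converting it with float() and formatting it again would correspond to the stats/sprintf variant.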

Trimming data from list to fit in a specific shape such as a geodataframe map

I have a bunch of approximated data in a list from which I create a color map. I have overlaid this map onto a map which was drawn from a geodataframe containing a shape file (with all coordinates of the polygon boundaries). In the picture below you can see that the cmap does not conform to the shape of the map.
To make things look cleaner, I would like to somehow 'trim' the edges off of the color map. How could I go about doing this? I have tried using df.total_bounds to conditionally do calculations before the colormap data is made, but this produces the results seen in the photo linked below.
Any solutions or input are appreciated, thanks!
Code snippet of the conditional calculation described above, where geo_minx/y and geo_maxx/y are the min and max (x,y) values taken from df.total_bounds:
if geo_minx <= realx <= geo_maxx and geo_miny <= realy <= geo_maxy:
Map with color map over a geodataframe shape file
Edit
Here is the structure of the dataframe that holds each approximated data point with its (x,y) coordinates:
approximated data
X Y approx
0 -124.6 24.6 1.006655
1 -124.6 24.8 1.006655
2 -124.6 25.0 1.006655
3 -124.6 25.2 1.006655
4 -124.6 25.4 1.006655
Here is the map dataframe structure:
<bound method NDFrame.head of STATEFP STATENS AFFGEOID GEOID ... LSAD ALAND AWATER geometry
0 28 01779790 0400000US28 28 ... 00 121533519481 3926919758 MULTIPOLYGON (((-88.50297 30.21523, -88.49176 ...
1 37 01027616 0400000US37 37 ... 00 125923656064 13466071395 MULTIPOLYGON (((-75.72681 35.93584, -75.71827 ...
2 40 01102857 0400000US40 40 ... 00 177662925723 3374587997 POLYGON ((-103.00257 36.52659, -103.00219 36.6...
3 51 01779803 0400000US51 51 ... 00 102257717110 8528531774 MULTIPOLYGON (((-75.74241 37.80835, -75.74151 ...
4 54 01779805 0400000US54 54 ... 00 62266474513 489028543 POLYGON ((-82.64320 38.16909, -82.64300 38.169.
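A note on why the bounding-box condition in the question cannot follow the outline: the bounds describe a rectangle, so every grid point inside that rectangle passes the test, including points outside the actual boundary. Trimming to the shape needs a point-in-polygon test instead. The following is a minimal, library-free sketch of that idea using ray casting. The L-shaped boundary and the grid are made-up stand-ins, not the data from the question, and the sketch handles only a single ring without holes. For real (multi)polygons, the containment tests offered by geometry libraries such as shapely are probably the better choice.

```python
def point_in_polygon(x, y, ring):
    """Ray casting: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray from (x, y) towards +x cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical L-shaped boundary; its bounding box is the square [0, 4] x [0, 4].
boundary = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
xs = [p[0] for p in boundary]
ys = [p[1] for p in boundary]
minx, maxx, miny, maxy = min(xs), max(xs), min(ys), max(ys)

# Hypothetical grid: points at 0.5, 1.5, 2.5, 3.5 in each direction.
grid = [(i + 0.5, j + 0.5) for i in range(4) for j in range(4)]

# The bounding-box test from the question keeps every grid point ...
in_bbox = [p for p in grid if minx <= p[0] <= maxx and miny <= p[1] <= maxy]
# ... while the polygon test drops the points in the missing corner.
in_shape = [p for p in grid if point_in_polygon(p[0], p[1], boundary)]

print(len(in_bbox), len(in_shape))
```

Only the points that pass the polygon test would then be used to build the color map.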

Missing Date xticks on chart for matplotlib on Python 3. Bug?

I am following this section. I realize this code was written for Python 2, but its chart shows xticks on the 'Start Date' axis and mine does not: my chart shows only the 'Start Date' label and no dates. I have tried converting the column to datetime, which shows the dates but breaks the graph below it, and the line is missing:
Graph
# Set as_index=False to keep the 0,1,2,... index. Then we'll take the mean of the polls on that day.
poll_df = poll_df.groupby(['Start Date'],as_index=False).mean()
# Let's go ahead and see what this looks like
poll_df.head()
Start Date Number of Observations Obama Romney Undecided Difference
0 2009-03-13 1403 44 44 12 0.00
1 2009-04-17 686 50 39 11 0.11
2 2009-05-14 1000 53 35 12 0.18
3 2009-06-12 638 48 40 12 0.08
4 2009-07-15 577 49 40 11 0.09
Great! Now plotting the Difference versus time should be straightforward.
# Plotting the difference in polls between Obama and Romney
fig = poll_df.plot('Start Date','Difference',figsize=(12,4),marker='o',linestyle='-',color='purple')
Notebook is here

plot 10 lines of 1000 values on gnuplot

I have a data file with 10 lines of 1000 values each, and I'm trying to plot these values with this script:
#!/usr/bin/gnuplot -persist
plot "data.dat" using [1:1000] title "" with lines
but I get this error
plot "data.dat" using [1:1000] title "" with lines
^
"./plot.sh", line 3: invalid expression
How can I indicate an interval from the first value to the 1000th value? Is it possible to set a different random color for every line?
As @vaettchen pointed out, gnuplot wants data in columns, and plotting rows is not straightforward. So it would be best if your data were transposed. Unfortunately, gnuplot has no function to transpose data, so you have to use external tools to transpose your data.
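As one possible external tool, here is a minimal Python sketch of such a transposition. It assumes whitespace-separated values; the function name transpose_rows and the "NaN" placeholder for short rows are illustrative choices, not part of gnuplot.

```python
from itertools import zip_longest


def transpose_rows(text, missing="NaN"):
    """Turn each input row into an output column, padding short rows."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    # zip_longest pairs up the i-th value of every row and fills gaps
    # with the placeholder so all output lines have the same length.
    columns = zip_longest(*rows, fillvalue=missing)
    return "\n".join(" ".join(col) for col in columns) + "\n"


sample = "11 12 13\n21 22 23\n31 32\n"
print(transpose_rows(sample), end="")
```

The output could be written to a file and then plotted column by column in the usual way.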
However, if your data is 10 lines with 1000 values each, i.e. a strict 10x1000 matrix, you could do something with gnuplot only (see below).
If your data is not a strict matrix, e.g. one line has more or fewer values or a value is missing, the method below won't work.
The following example (just 5 lines with 7 values each) illustrates plotting columns and plotting rows.
### plotting columns and rows
reset session
set colorsequence classic
$Data <<EOD
11 12 13 14 15 16 17
21 22 23 24 25 26 27
31 32 33 34 35 36 37
41 42 43 44 45 46 47
51 52 53 54 55 56 57
EOD
# get the number of rows
stats $Data u 0 nooutput
RowCount = STATS_records
# do the plot
set multiplot layout 1,2
set title "Plotting columns"
set xlabel "Row no."
set xtics 1
# plot all columns from 1 to *(=autodetection)
plot for [i=1:*] $Data u ($0+1):i w lp pt 7 not
set title "Plotting rows"
set xlabel "Column no."
# plot all rows
plot for [i=0:RowCount-1] $Data matrix u ($1+1):3 every :::i::i w lp pt 7 not
unset multiplot
### end of code
Which results in:

How set point type from data in gnuplot?

How do I set the point type from data in gnuplot?
gnuplot script:
set terminal pngcairo size 640,480
set output "points.png"
set style data points
set auto x
set autoscale x
unset colorbox
plot 'test.data' using 2:1 with points notitle
test.data
32 35 8
34 34 6
36 28 1
34 32 2
28 30 7
38 30 9
34 29 2
35 36 9
39 34 3
31 33 9
28 31 6
35 30 5
33 41 4
32 37 3
How do I get the point type from the third column?
plot 'gnuplot.data' using 2:1 with points pt (:3) notitle // error
abstraction example:
need:
gnuplot Version 4.6 patchlevel 4
In gnuplot 4.6 there is no option to select the point type from the data file based on a column (equivalent to linecolor variable, pointsize variable or arrowstyle variable). Basically you have two options:
Iterate over all possible point types (which you can extract with stats if this should be variable) and for each number plot only those points which match the current point type:
stats 'test.data' using 3 nooutput
unset key
set style data points
plot for [i=STATS_min:STATS_max] 'test.data' using 2:($3 == i ? $1 : 1/0) lt 1 pt i ps 2
Use the labels plotting style and a sequence of Unicode point symbols from which you select using the value from the third column as index. (Use e.g. http://www.shapecatcher.com or http://decodeunicode.org/en/geometric_shapes to find suitable symbols.)
unset key
set encoding utf8
symbol(z) = "•✷+△♠□♣♥♦"[int(z):int(z)]
plot 'test.data' using 2:1:(symbol($3)) with labels textcolor lt 1
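The first option boils down to splitting the rows by the value in the third column and drawing each subset with its own point type. Rows that do not match the current value evaluate to 1/0 (undefined) and are skipped. A minimal Python sketch of that grouping, with the hypothetical helper name group_by_point_type and a shortened sample, might look like this:

```python
from collections import defaultdict


def group_by_point_type(text):
    """Map each point type (column 3) to its list of (x, y) points."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        col1, col2, pt = float(fields[0]), float(fields[1]), int(fields[2])
        groups[pt].append((col2, col1))  # same x:y order as `using 2:1`
    return dict(groups)


sample = """\
32 35 8
34 34 6
36 28 1
34 32 2
34 29 2
"""

for pt, points in sorted(group_by_point_type(sample).items()):
    print(pt, points)
```

Each resulting group corresponds to one iteration of the plot for loop above.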

add horizontal line histogram gnuplot

I would like to add a horizontal line in my histogram in gnuplot, is that possible?
My histogram has on the x axis: alea1 alea 2 alea3 nalea1 nalea 2 nalea 3
and the y axis goes from 0 to 25.
At 22, I want to add a horizontal line that goes all the way across from one end to the other end of the histogram.
Try adding
, 22 title ""
at the end of your plot command. Works for my test data (file "histo"):
# Year Red Green Blue
1990 33 45 18
1991 35 42 19
1992 34 44 14
1993 37 43 25
1994 47 15 30
1995 41 14 32
1996 42 20 35
1997 39 21 31
plot "histo" u 2 t "Red" w histograms, "" u 3 t "Green" w histograms, "" u 4 t "Blue" w histograms, 22 title ""
(taken from Philip K. Janert, Gnuplot in Action)
The typical way to add horizontal and/or vertical lines is with an arrow
set arrow from x1,y1 to x2,y2 nohead linestyle ...
For a horizontal line, y1 and y2 will be the same. From your question, I'm a little unsure what you mean by "at 22", but I'm guessing you mean that you want to plot the line y=22 on top of your histogram. If that's the case, try this (before your plot command).
set arrow from graph 0,first 22 to graph 1,first 22 nohead lc rgb "#000000" front
